🤖 AI Summary
Existing literary translation evaluation relies predominantly on automated metrics or subjective human ratings, often neglecting stylistic features and lacking systematic assessment of large language models' (LLMs) literary translation capabilities. This study pioneers the systematic application of computational stylistics to evaluating LLMs' stylistic fidelity in literary translation, comparing GPT-4 against professional human translators on Chinese web novels translated into English. We quantitatively analyze lexical diversity, n-gram distributions, syntactic complexity, and semantic similarity. Statistical tests reveal no significant stylistic differences between GPT-4 and human translations across multiple dimensions, indicating GPT-4's capacity to emulate human translators' "humanistic sensibility." The work not only demonstrates LLMs' potential for modeling literary style but also advances machine translation evaluation paradigms by introducing stylistic rigor, with findings that blur the stylistic boundary between human and machine translation.
📝 Abstract
Existing research indicates that machine translations (MTs) of literary texts are often unsatisfactory. MTs are typically evaluated using automated metrics and subjective human ratings, with limited focus on stylistic features. Evidence is also limited on whether state-of-the-art large language models (LLMs) will reshape literary translation. This study examines the stylistic features of LLM translations, comparing GPT-4's performance to human translations on a Chinese online literature task. A computational stylometric analysis shows that GPT-4 translations closely align with human translations in lexical, syntactic, and content features, suggesting that LLMs might replicate the 'human touch' in literary translation style. These findings offer insights into AI's impact on literary translation from a posthuman perspective, in which the distinction between machine and human translations becomes increasingly blurred.
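The stylometric comparison described above rests on standard corpus measures. As an illustrative sketch only (the paper does not publish its code, and the toy sentences below are invented stand-ins for a human and an LLM rendering), two of the named measures, lexical diversity via type-token ratio and n-gram distribution overlap via cosine similarity, can be computed like this:

```python
from collections import Counter
import math

def type_token_ratio(tokens):
    """Lexical diversity: unique tokens divided by total tokens."""
    return len(set(tokens)) / len(tokens)

def ngram_counts(tokens, n):
    """Frequency distribution of n-grams as a Counter of tuples."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def cosine_similarity(c1, c2):
    """Cosine similarity between two frequency distributions."""
    dot = sum(c1[k] * c2[k] for k in set(c1) & set(c2))
    norm1 = math.sqrt(sum(v * v for v in c1.values()))
    norm2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

# Hypothetical token streams standing in for aligned translation segments.
human = "the wandering blade caught the moonlight as he turned".split()
llm = "the drifting blade caught moonlight when he turned".split()

print(type_token_ratio(human))
print(cosine_similarity(ngram_counts(human, 2), ngram_counts(llm, 2)))
```

In practice such scores would be computed over full translated corpora and then compared with significance tests, rather than over single sentences as here.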