Translationese-index: Using Likelihood Ratios for Graded and Generalizable Measurement of Translationese

📅 2025-07-16
🤖 AI Summary
This paper addresses the lack of a quantitative, graded, and generalizable measure of “translationese” in translated texts. It proposes the translationese-index (T-index), which the authors describe as the first quantitative measure of translationese. T-index is computed from the likelihood ratios of two contrastively fine-tuned language models (0.5B parameters each), giving a continuous score that can be evaluated across domains. Relative differences in T-index between translations predict pairwise human translationese annotations, and absolute values correlate with human ratings of the degree of translationese (Pearson’s r = 0.568). Fine-tuning on only 1–5k pairs of synthetic data is enough to capture translationese in the wild. T-index correlates only weakly with existing machine translation metrics such as BLEU and COMET, which suggests that it measures something these metrics miss and can complement them.

📝 Abstract
In this paper, we propose the first quantitative measure for translationese -- the translationese-index (T-index) for graded and generalizable measurement of translationese, computed from the likelihood ratios of two contrastively fine-tuned language models (LMs). We use a synthesized dataset and a dataset with translations in the wild to evaluate T-index's generalizability in cross-domain settings and its validity against human judgments. Our results show that T-index is both robust and efficient. T-index scored by two 0.5B LMs fine-tuned on only 1-5k pairs of synthetic data captures translationese in the wild well. We find that the relative differences in T-indices between translations predict pairwise translationese annotations obtained from human annotators well, and that the absolute values of T-indices correlate well with human ratings of degrees of translationese (Pearson's $r = 0.568$). Additionally, the correlation between T-index and existing machine translation (MT) quality estimation (QE) metrics such as BLEU and COMET is low, suggesting that T-index is not covered by these metrics and can serve as a complementary metric in MT QE.
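As a rough illustration of what "computed from the likelihood ratios of two LMs" could look like, the sketch below scores one sentence from per-token log-probabilities. Several things here are assumptions rather than details from the abstract: the function name `t_index_sketch`, the direction of the ratio (translationese-tuned model minus natural-text-tuned model), the optional length normalization, and the idea that the per-token log-probabilities have already been obtained from the two fine-tuned models. The numbers are made up for the example.

```python
import math
from typing import Sequence


def t_index_sketch(
    logp_translationese: Sequence[float],
    logp_natural: Sequence[float],
    normalize: bool = True,
) -> float:
    """Log-likelihood ratio of one sentence under two LMs.

    Each argument holds per-token log-probabilities of the SAME token
    sequence, one list per model. Assumption: a higher value means the
    sentence looks more like translationese.
    """
    if len(logp_translationese) != len(logp_natural):
        raise ValueError("both models must score the same tokens")
    if not logp_translationese:
        raise ValueError("empty token sequence")

    # log p_A(x) - log p_B(x) is the log of the likelihood ratio p_A(x) / p_B(x).
    ratio = math.fsum(logp_translationese) - math.fsum(logp_natural)

    # Assumed design choice: divide by length so long sentences do not
    # receive larger scores simply because they have more tokens.
    return ratio / len(logp_translationese) if normalize else ratio


# Hypothetical per-token log-probabilities for a three-token sentence.
lp_translationese = [-1.0, -2.0, -0.5]
lp_natural = [-1.5, -2.5, -2.0]

print(t_index_sketch(lp_translationese, lp_natural, normalize=False))
print(t_index_sketch(lp_translationese, lp_natural))
```

Working in log space keeps the computation numerically stable: the raw likelihoods of long sentences underflow quickly, while the difference of summed log-probabilities does not.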
Problem

Research questions and friction points this paper is trying to address.

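Translationese has lacked a quantitative, graded measure; the paper proposes one based on likelihood ratios
Whether T-index generalizes across domains, tested on a synthesized dataset and on translations in the wild
Whether T-index agrees with human judgments, and how it relates to existing MT metrics such as BLEU and COMET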
Innovation

Methods, ideas, or system contributions that make the work stand out.

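Computes T-index from the likelihood ratios of two contrastively fine-tuned 0.5B LMs
Needs only 1-5k pairs of synthetic fine-tuning data to capture translationese in the wild
Predicts pairwise human annotations from score differences and correlates with human ratings (Pearson's r = 0.568)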
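The two kinds of validation mentioned above (pairwise agreement from score differences, and Pearson correlation with graded human ratings) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the helper names `pairwise_accuracy` and `pearson_r`, the data layout, and all numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pairwise_accuracy(
    scores: Sequence[float],
    judgments: Sequence[Tuple[int, int]],
) -> float:
    """Share of human pairwise judgments that the scores reproduce.

    Each judgment (i, j) means annotators found translation i MORE
    translationese than translation j; it counts as reproduced when
    scores[i] > scores[j].
    """
    if not judgments:
        raise ValueError("no judgments given")
    hits = sum(1 for i, j in judgments if scores[i] > scores[j])
    return hits / len(judgments)


# Hypothetical scores for four translations and hypothetical human ratings.
scores = [0.9, 0.1, 0.5, 0.3]
human_ratings = [4.0, 1.0, 3.0, 3.0]
judgments = [(0, 1), (2, 1), (3, 2)]

print(pairwise_accuracy(scores, judgments))
print(pearson_r(scores, human_ratings))
```

The pairwise check only uses the sign of the score difference, while Pearson's r tests whether the absolute values track the graded ratings, which is why the two are reported separately.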
Yikang Liu
Shanghai Jiao Tong University
Computational Linguistics
Wanyang Zhang
School of Software and Microelectronics, Peking University
Yiming Wang
Dept. of Computer Science and Engineering, Shanghai Jiao Tong University
Jialong Tang
Qwen Team, Alibaba
LLM, NLP
Pei Zhang
Tongyi Lab
Baosong Yang
Alibaba-inc
Machine Learning, Large Language Model, Machine Translation
Fei Huang
Tongyi Lab
Rui Wang
Dept. of Computer Science and Engineering, Shanghai Jiao Tong University
Hai Hu
City University of Hong Kong
computational linguistics, natural language inference, Chinese linguistics, corpus annotation