Translationese-index: Using Likelihood Ratios for Graded and Generalizable Measurement of Translationese

📅 2025-07-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the lack of a quantitative, graded, and generalizable measure of “translationese” in translated text. It proposes the translationese-index (T-index), described as the first quantitative measure of translationese. T-index is computed from the likelihood ratios of two contrastively fine-tuned language models (0.5B parameters each), which yields a continuous score that can be applied across domains. Relative differences in T-index between translations predict pairwise human translationese annotations, and absolute T-index values correlate with human ratings of the degree of translationese (Pearson’s r = 0.568). The two models need only 1-5k pairs of synthetic data for fine-tuning, and the resulting metric still captures translationese in translations in the wild. T-index correlates only weakly with existing machine translation quality estimation metrics such as BLEU and COMET, which suggests it can complement them.

📝 Abstract

In this paper, we propose the first quantitative measure for translationese -- the translationese-index (T-index) for graded and generalizable measurement of translationese, computed from the likelihood ratios of two contrastively fine-tuned language models (LMs). We use a synthesized dataset and a dataset with translations in the wild to evaluate T-index's generalizability in cross-domain settings and its validity against human judgments. Our results show that T-index is both robust and efficient. T-index scored by two 0.5B LMs fine-tuned on only 1-5k pairs of synthetic data can well capture translationese in the wild. We find that the relative differences in T-indices between translations can well predict pairwise translationese annotations obtained from human annotators; and the absolute values of T-indices correlate well with human ratings of degrees of translationese (Pearson's $r = 0.568$). Additionally, the correlation between T-index and existing machine translation (MT) quality estimation (QE) metrics such as BLEU and COMET is low, suggesting that T-index is not covered by these metrics and can serve as a complementary metric in MT QE.
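The core computation described in the abstract, a likelihood ratio between two language models, can be illustrated with a minimal sketch. It assumes that each model has already scored the same translation and returned one log-probability per token. The function name, the direction of the ratio (translationese-tuned model minus natural-text model), and the optional length normalization are illustrative assumptions, not details confirmed by the abstract.

```python
from typing import Sequence


def log_likelihood_ratio(
    logprobs_translationese: Sequence[float],
    logprobs_natural: Sequence[float],
    length_normalize: bool = True,
) -> float:
    """Log-likelihood ratio of one text under two language models.

    Both inputs are per-token log-probabilities for the SAME tokenized text:
    one from a model assumed to be tuned toward translationese, the other
    from a model assumed to be tuned toward natural text. The sign convention
    and the per-token normalization are assumptions made for this sketch.
    """
    if len(logprobs_translationese) != len(logprobs_natural):
        raise ValueError("both models must score the same token sequence")
    if not logprobs_translationese:
        raise ValueError("cannot score an empty sequence")

    # Summing token log-probabilities gives the sequence log-likelihood,
    # so the difference of sums is the log of the likelihood ratio.
    ratio = sum(logprobs_translationese) - sum(logprobs_natural)

    if length_normalize:
        # Dividing by length keeps long and short texts comparable.
        ratio /= len(logprobs_translationese)
    return ratio
```

Under this convention, a positive value means the translationese-tuned model finds the text more probable than the natural-text model does.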

Problem

Research questions and friction points this paper is trying to address.

Measure translationese quantitatively using likelihood ratios

Evaluate T-index's cross-domain generalizability and validity on a synthesized dataset and a dataset of translations in the wild

Assess T-index correlation with human judgments and MT metrics
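The two kinds of validation named in the abstract, agreement with pairwise human annotations and correlation with human ratings, can be sketched as follows. This is a generic illustration rather than the paper's evaluation code: the function names, the label encoding ("a" or "b" for the translation judged to show more translationese), the tie handling, and the convention that a higher score means more translationese are all assumptions.

```python
import math
from typing import Sequence, Tuple


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between scores and human ratings."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pairwise_accuracy(
    score_pairs: Sequence[Tuple[float, float]],
    human_labels: Sequence[str],
) -> float:
    """Fraction of pairs where the score difference matches the human label.

    Each pair holds the scores of translations A and B. The label is "a" if
    annotators judged A to show more translationese, otherwise "b". Ties in
    score are treated as predicting "b" (an arbitrary choice for this sketch).
    """
    if len(score_pairs) != len(human_labels) or not score_pairs:
        raise ValueError("need one label per non-empty list of pairs")
    correct = sum(
        (score_a > score_b) == (label == "a")
        for (score_a, score_b), label in zip(score_pairs, human_labels)
    )
    return correct / len(score_pairs)
```

For example, `pairwise_accuracy([(0.9, 0.1), (0.2, 0.5)], ["a", "b"])` counts both pairs as correct, because the higher-scoring translation matches the annotators' choice in each case.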

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses likelihood ratios from fine-tuned LMs

Measures translationese robustly across domains with T-index

Correlates with human ratings of translationese (Pearson's r = 0.568)
