🤖 AI Summary
When monolingual users cannot directly assess machine translation (MT) quality, how can quality feedback mechanisms support more rational decisions about sharing translations?
Method: We conducted a controlled, comparative study of four feedback modalities—error highlighting, LLM-generated explanations, back-translation, and a structured question-answer table—systematically distinguishing explicit from implicit feedback.
Contribution/Results: All feedback types except error highlighting significantly improved decision accuracy and trust calibration. Implicit feedback—particularly the question-answer table—outperformed explicit feedback in accuracy, trust calibration, and user experience, while imposing the lowest cognitive load. Our study establishes a rigorously controlled, comparable experimental paradigm, providing both theoretical grounding and practical design principles for trustworthy human–MT collaboration.
📝 Abstract
As people increasingly use AI systems in work and daily life, feedback mechanisms that help them use AI responsibly are urgently needed, particularly in settings where users are not equipped to assess the quality of AI predictions. We study a realistic Machine Translation (MT) scenario where monolingual users decide whether to share an MT output, first without and then with quality feedback. We compare four types of quality feedback: explicit feedback that directly gives users an assessment of translation quality via 1) error highlights and 2) LLM explanations, and implicit feedback that helps users compare MT inputs and outputs through 3) back-translation and 4) question-answer (QA) tables. We find that all feedback types, except error highlights, significantly improve both decision accuracy and appropriate reliance. Notably, implicit feedback, especially QA tables, yields significantly greater gains than explicit feedback in decision accuracy, appropriate reliance, and user perceptions, receiving the highest ratings for helpfulness and trust, and the lowest for mental burden.