An Empirical Study of On-Device Translation for Real-Time Live-Stream Chat on Mobile Devices

📅 2026-01-06
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the challenges of deploying real-time live-stream chat translation models on mobile devices, particularly CPU utilization, thermal management, and model selection. The authors introduce LiveChatBench, the first dedicated benchmark for this task, consisting of 1,000 Korean-English parallel sentence pairs, and conduct a systematic evaluation of multiple on-device neural machine translation models across five real-world mobile devices. The analysis examines resource consumption, thermal behavior, and domain adaptation capabilities, revealing critical trade-offs among model performance, hardware constraints, and domain suitability. Notably, selected lightweight models achieve translation quality comparable to commercial systems such as GPT-5.1 under stringent on-device conditions, demonstrating a viable and efficient path to real-time on-device live chat translation.

📝 Abstract
Despite the efficiency of on-device AI models, there has been little research on the practical aspects required for their real-world deployment, such as the device's CPU utilization and thermal conditions. In this paper, through extensive experiments, we investigate two key issues that must be addressed to deploy on-device models in real-world services: (i) the selection of on-device models and the resource consumption of each model, and (ii) the capability and potential of on-device models for domain adaptation. To this end, we focus on the task of translating live-stream chat messages and manually construct LiveChatBench, a benchmark consisting of 1,000 Korean-English parallel sentence pairs. Experiments on five mobile devices demonstrate that, although serving a large and heterogeneous user base requires careful consideration of highly constrained deployment settings and model selection, the proposed approach nevertheless achieves performance comparable to commercial models such as GPT-5.1 on the well-targeted task. We expect that our findings will provide meaningful insights to the on-device AI community.
Problem

Research questions and friction points this paper is trying to address.

on-device AI
real-time translation
mobile deployment
resource consumption
domain adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

on-device AI
live-stream chat translation
resource-constrained deployment
domain adaptation
mobile benchmarking
🔎 Similar Papers
2024-02-16 · International Workshop on Spoken Language Translation · Citations: 4
2024-08-29 · International Conference on Computational Linguistics · Citations: 0
Jeiyoon Park
SOOP
Daehwan Lee
SOOP
Changmin Yeo
SOOP
Yongshin Han
SOOP
Minseop Kim
Pusan National University
Smart Port · Port Operation · Time Series Forecasting · Artificial Intelligence