🤖 AI Summary
This work addresses the trade-off between translation quality, inference efficiency, and deployment cost in multilingual translation for complex real-world scenarios. It introduces Hy-MT2, a family of fast-thinking models at three scales, 1.8B, 7B, and 30B-A3B (Mixture-of-Experts), each supporting translation among 33 languages and following translation instructions written in multiple languages. For on-device deployment, AngelSlim 1.25-bit extreme quantization reduces the 1.8B model to 440 MB of storage and improves its inference speed by 1.5×. In multi-dimensional evaluations covering general, real-world business, domain-specific, and instruction-following translation, the 7B and 30B models outperform open-source models such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, and the 1.8B model surpasses mainstream commercial APIs from providers such as Microsoft and Doubao overall.
📝 Abstract
Hy-MT2 is a family of fast-thinking multilingual translation models designed for complex real-world scenarios. It includes three model sizes: 1.8B, 7B, and 30B-A3B (MoE), all of which support translation among 33 languages and effectively follow translation instructions in multiple languages. For on-device deployment, with AngelSlim 1.25-bit extreme quantization, the 1.8B model requires only 440 MB of storage and improves inference speed by 1.5x. Multi-dimensional evaluations show that Hy-MT2 delivers outstanding performance across general, real-world business, domain-specific, and instruction-following translation tasks. The 7B and 30B models outperform open-source models such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, while the lightweight 1.8B model also surpasses mainstream commercial APIs from providers such as Microsoft and Doubao overall.
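To put the 440 MB figure in context, the following is a rough back-of-envelope sketch, not the authors' accounting. It assumes that "1.8B" means exactly 1.8e9 parameters, that "MB" means 10^6 bytes, and that every weight is stored at the same bit-width; none of these assumptions is stated in the abstract, and the helper name `nominal_weight_megabytes` is hypothetical.

```python
def nominal_weight_megabytes(num_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in decimal megabytes (1 MB = 10**6 bytes)."""
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 1e6


# Assumption: "1.8B" is taken as exactly 1.8e9 parameters.
PARAMS_1_8B = 1.8e9
# Figure reported in the abstract for the quantized 1.8B model.
REPORTED_MB = 440.0

# Nominal size if every weight used 1.25 bits, and a 16-bit baseline for comparison.
quantized_mb = nominal_weight_megabytes(PARAMS_1_8B, 1.25)
fp16_mb = nominal_weight_megabytes(PARAMS_1_8B, 16)

# Average bits per parameter implied by the reported 440 MB.
effective_bits = REPORTED_MB * 1e6 * 8 / PARAMS_1_8B

print(f"nominal 1.25-bit weights: {quantized_mb:.2f} MB")
print(f"16-bit baseline:          {fp16_mb:.2f} MB")
print(f"implied bits/parameter:   {effective_bits:.2f}")
```

Under these assumptions the nominal 1.25-bit footprint is about 281 MB, so the reported 440 MB corresponds to roughly 1.96 bits per parameter on average. The abstract does not say what accounts for the difference; parts of the model stored at higher precision or quantization metadata are possible explanations, but that is speculation.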