🤖 AI Summary
This work addresses the challenge of deploying high-performance AI in mobile MOBA games, where large policy models incur prohibitive latency and energy costs on resource-constrained devices and are inherently difficult to compress. The authors propose a Pareto-optimality-guided knowledge distillation framework, paired with a mobile-oriented efficient student architecture search space, to jointly and systematically optimize AI performance and deployment efficiency for the first time. By leveraging neural architecture search, multimodal state compression, and lightweight design principles, the resulting model achieves a per-frame inference time under 0.5 ms (12.4× faster) and consumes less than 0.5 mAh per match (15.6× more energy-efficient), while maintaining a 40.32% win rate against the original teacher model.
📝 Abstract
Recent advances in game AI have demonstrated the feasibility of training agents that surpass top-tier human professionals in complex environments such as Honor of Kings (HoK), a leading mobile multiplayer online battle arena (MOBA) game. However, deploying such powerful agents on mobile devices remains a major challenge. On one hand, the intricate multi-modal state representation and hierarchical action space of HoK demand large, sophisticated policy networks that are inherently difficult to compress into lightweight forms. On the other hand, production deployment requires high-frequency inference under strict energy and latency constraints on mobile platforms. To the best of our knowledge, bridging large-scale game AI and practical on-device deployment has not been systematically studied. In this work, we propose a Pareto-optimality-guided pipeline and design a high-efficiency student architecture search space tailored for mobile execution, enabling systematic exploration of the trade-off between performance and efficiency. Experimental results demonstrate that the distilled model achieves remarkable efficiency, including a $12.4\times$ faster inference speed (under 0.5 ms per frame) and a $15.6\times$ improvement in energy efficiency (under 0.5 mAh per game), while retaining a 40.32% win rate against the original teacher model.
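The Pareto-optimality-guided search described in the abstract amounts to keeping only those candidate student architectures that are not dominated on the (performance, efficiency) objectives, i.e. no other candidate is at least as good on both axes and strictly better on one. A minimal illustrative sketch of such a dominance filter is below; the candidate numbers and the `pareto_front` helper are hypothetical, not taken from the paper:

```python
# Hypothetical sketch of Pareto-front filtering over candidate students,
# scored by win rate vs. the teacher (higher is better) and per-frame
# inference latency in ms (lower is better). Example values are made up.

def pareto_front(candidates):
    """Return the candidates not dominated by any other candidate.

    Candidate j dominates candidate i if it has win rate >= i's and
    latency <= i's, with at least one of the two strictly better.
    """
    front = []
    for i, (w_i, l_i) in enumerate(candidates):
        dominated = any(
            w_j >= w_i and l_j <= l_i and (w_j > w_i or l_j < l_i)
            for j, (w_j, l_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append((w_i, l_i))
    return front

# (win_rate_vs_teacher, latency_ms) for five illustrative students
candidates = [(0.40, 0.5), (0.35, 0.3), (0.42, 2.0), (0.30, 0.4), (0.38, 0.6)]
print(pareto_front(candidates))  # → [(0.40, 0.5), (0.35, 0.3), (0.42, 2.0)]
```

The last two candidates are dominated (each is beaten on both objectives by another student), so the search would only continue distillation from the three front members, trading win rate against latency along the frontier.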