ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices

Date: 2026-02-25
Citations: 0
Influential: 0
AI Summary
This work addresses two gaps in current mobile multimodal large language models. First, they are largely confined to passive responses: they neither perceive users' implicit intentions proactively nor generate executable actions from them. Second, no systematic benchmark exists for evaluating these abilities. We propose the first comprehensive benchmark for proactive intelligence in mobile scenarios. Its task is to infer user intent from four dimensions of contextual signals and to generate an executable function sequence; the benchmark spans 14 real-world categories, 63 APIs, and over 3,660 instances. We also introduce an evaluation framework that supports multi-reference annotations and expert review, together with a model fine-tuned from Qwen2.5-VL-7B-Instruct. In experiments, the fine-tuned model achieves a success rate of 19.15%, outperforming o1 (15.71%) and GPT-5 (7.39%), which supports both the learnability of proactive intelligence and the usefulness of the proposed benchmark.

Abstract
Multimodal large language models (MLLMs) have made significant progress in mobile agent development, yet their capabilities are predominantly confined to a reactive paradigm, where they merely execute explicit user commands. The emerging paradigm of proactive intelligence, where agents autonomously anticipate needs and initiate actions, represents the next frontier for mobile agents. However, its development is critically bottlenecked by the lack of benchmarks that can address real-world complexity and enable objective, executable evaluation. To overcome these challenges, we introduce ProactiveMobile, a comprehensive benchmark designed to systematically advance research in this domain. ProactiveMobile formalizes the proactive task as inferring latent user intent across four dimensions of on-device contextual signals and generating an executable function sequence from a comprehensive function pool of 63 APIs. The benchmark features over 3,660 instances of 14 scenarios that embrace real-world complexity through multi-answer annotations. To ensure quality, a team of 30 experts conducts a final audit of the benchmark, verifying factual accuracy, logical consistency, and action feasibility, and correcting any non-compliant entries. Extensive experiments demonstrate that our fine-tuned Qwen2.5-VL-7B-Instruct achieves a success rate of 19.15%, outperforming o1 (15.71%) and GPT-5 (7.39%). This result indicates that proactivity is a critical competency widely lacking in current MLLMs, yet it is learnable, emphasizing the importance of the proposed benchmark for proactivity evaluation.
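The multi-answer evaluation described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a prediction counts as a success when its function sequence exactly matches any one of the annotated reference sequences, and the function names (`set_alarm`, `create_reminder`, `open_navigation`) are hypothetical placeholders rather than entries from the benchmark's 63-API pool. The paper's actual matching criterion may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionCall:
    """One call in an executable function sequence (hypothetical schema)."""
    name: str
    args: tuple  # (key, value) pairs sorted by key, so argument order is irrelevant


def call(name, **kwargs):
    """Build a FunctionCall with arguments normalised into sorted pairs."""
    return FunctionCall(name, tuple(sorted(kwargs.items())))


def is_success(predicted, references):
    """Assumed criterion: exact match against ANY annotated reference sequence."""
    return any(list(predicted) == list(ref) for ref in references)


def success_rate(predictions, reference_sets):
    """Fraction of instances whose prediction matches one of its references."""
    if not predictions:
        return 0.0
    hits = sum(is_success(p, refs) for p, refs in zip(predictions, reference_sets))
    return hits / len(predictions)


# Hypothetical instance with two acceptable answers.
references = [
    [call("set_alarm", time="07:00")],
    [call("create_reminder", text="flight", time="07:00")],
]
matching = [call("set_alarm", time="07:00")]
non_matching = [call("open_navigation", destination="airport")]

rate = success_rate([matching, non_matching], [references, references])
```

Accepting any of several references reflects the point that one context can admit more than one reasonable proactive action; a single-reference metric would mark the alternative as a failure.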
Problem

Research questions and friction points this paper is trying to address.

proactive intelligence
mobile agents
benchmark
multimodal large language models
contextual signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

proactive intelligence
mobile agents
multimodal large language models
benchmark
executable evaluation
Dezhi Kong
HyperAI Team, Xiaomi Corporation
Zhengzhao Feng
HyperAI Team, Xiaomi Corporation; Zhejiang University
Qiliang Liang
HyperAI Team, Xiaomi Corporation; Peking University
Hao Wang
HyperAI Team, Xiaomi Corporation
Haofei Sun
HyperAI Team, Xiaomi Corporation
Changpeng Yang
HyperAI Team, Xiaomi Corporation
Yang Li
HyperAI Team, Xiaomi Corporation
Peng Zhou
HyperAI Team, Xiaomi Corporation
Shuai Nie
HyperAI Team, Xiaomi Corporation
Hongzhen Wang
HyperAI Team, Xiaomi Corporation
Linfeng Zhou
HyperAI Team, Xiaomi Corporation; Northeastern University
Hao Jia
School of Medicine, Nankai University
Brain signal processing, pattern decoding
Jiaming Xu
Xiaomi Corp.; previously at CASIA
Speech and language processing, speech separation, dialogue systems
Runyu Shi
HyperAI Team, Xiaomi Corporation
Ying Huang
HyperAI Team, Xiaomi Corporation