🤖 AI Summary
This work addresses the limitations of existing mobile ad detection methods, which struggle to identify obfuscated or dynamically injected ads: static analysis ignores runtime behaviors, while heuristic-driven UI exploration suffers from low efficiency and poor coverage. To overcome these challenges, we propose MANA, the first agentic multimodal reasoning framework for mobile ad detection, which integrates static, visual, temporal, and experiential signals. Its reasoning-guided UI navigation strategy jointly determines exploration paths and regions of interest. Evaluated on over 200 apps running on commercial smartphones, MANA improves detection accuracy by 30.5%–56.3% and reduces exploration steps by 29.7%–63.3% compared to baselines, substantially enhancing the discovery of obfuscated and malicious advertisements.
📝 Abstract
Mobile advertising dominates app monetization but introduces risks ranging from intrusive user experiences to malware delivery. Existing detection methods rely either on static analysis, which misses runtime behaviors, or on heuristic UI exploration, which struggles with sparse and obfuscated ads. In this paper, we present MANA, the first agentic multimodal reasoning framework for mobile ad detection. MANA integrates static, visual, temporal, and experiential signals into a reasoning-guided navigation strategy that determines not only how to traverse interfaces but also where to focus, enabling efficient and robust exploration. We implement MANA on commercial smartphones and evaluate it on over 200 apps, achieving state-of-the-art accuracy and efficiency. Compared to baselines, it improves detection accuracy by 30.5%–56.3% and reduces exploration steps by 29.7%–63.3%. Case studies further demonstrate its ability to uncover obfuscated and malicious ads, underscoring its practicality for mobile ad auditing and its potential for broader runtime UI analysis (e.g., permission abuse). Code and dataset are available at https://github.com/MANA-2026/MANA.