🤖 AI Summary
Android malware often behaves at runtime in ways that diverge from the intent its UI conveys to the user, and existing detectors capture this mismatch poorly. Method: This paper proposes UIXPOSE, a source-code-agnostic framework that uses vision-language models (VLMs) and knowledge structures to infer user intent from UI screenshots, and combines heterogeneous runtime signals (decoded network payloads, heap/memory signals, and resource utilisation traces) into a behavior vector. It computes the alignment between the intent and behavior vectors at runtime to detect misbehavior. Contribution/Results: UIXPOSE applies the Intention Behaviour Alignment (IBA) paradigm to mobile malware analysis, addressing the limitations of static, permission-centric or widget-level intent inference and of coarse-grained dynamic monitoring that misses content and context. In three real-world case studies, UIXPOSE reveals covert data exfiltration and hidden background activity that evade metadata-only baselines.
📝 Abstract
We introduce UIXPOSE, a source-code-agnostic framework that operates on both compiled and open-source apps. This framework applies Intention Behaviour Alignment (IBA) to mobile malware analysis, aligning UI-inferred intent with runtime semantics. Previous work either infers intent statically (e.g., permission-centric or widget-level analysis) or monitors coarse dynamic signals (endpoints, partial resource usage) that miss content and context. UIXPOSE infers an intent vector from each screen using vision-language models and knowledge structures, and combines decoded network payloads, heap/memory signals, and resource utilisation traces into a behaviour vector. The alignment between the two vectors, calculated at runtime, can both detect misbehaviour and guide exploration of behaviourally rich paths. In three real-world case studies, UIXPOSE reveals covert exfiltration and hidden background activity that evade metadata-only baselines, demonstrating how IBA improves dynamic detection.
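To make the idea of comparing an intent vector with a behaviour vector concrete, the following is a minimal illustrative sketch, not the method used by UIXPOSE. The abstract does not state how the vectors are encoded or how alignment is scored, so everything here is an assumption: the four feature axes in `AXES`, the use of cosine similarity as the alignment score, the `threshold` value, and the function names `cosine` and `alignment_report` are all hypothetical.

```python
import math

# Hypothetical feature axes shared by both vectors. The actual dimensions
# used by UIXPOSE are not given in the abstract.
AXES = ["network_upload", "location_access", "contacts_access", "background_cpu"]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def alignment_report(intent, behaviour, threshold=0.5):
    """Score intent/behaviour alignment and list axes where behaviour exceeds intent.

    `threshold` is an assumed cut-off chosen for illustration only.
    """
    score = cosine(intent, behaviour)
    # Axes on which observed behaviour is stronger than the screen's intent suggests.
    excess = {
        axis: b - i
        for axis, i, b in zip(AXES, intent, behaviour)
        if b - i > 0
    }
    return {"score": score, "misaligned": score < threshold, "excess": excess}


# A screen whose UI suggests almost no data use (e.g. a simple utility screen) ...
intent_vec = [0.0, 0.0, 0.0, 0.1]
# ... while the runtime trace shows heavy uploads and contact reads.
behaviour_vec = [0.9, 0.0, 0.8, 0.2]

report = alignment_report(intent_vec, behaviour_vec)
print(report["misaligned"], sorted(report["excess"]))
```

In this toy setup a low score flags the screen as misaligned, and the `excess` map points to the axes (here uploads and contact access) that the UI did not justify, which is the kind of content-level signal a metadata-only baseline would not surface.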