🤖 AI Summary
Android malware detection faces rapid attack evolution, severe dataset bias, poor interpretability, and the limitations of large language models (LLMs) in handling long contexts and understanding code structure. Method: This paper proposes a context-driven LLM analysis framework featuring (i) novel security-critical context extraction and program graph structural modeling; (ii) a three-tier hierarchical reasoning paradigm that maps low-level instructions to functional logic and then to high-level semantics; and (iii) first-tier factual consistency verification to mitigate hallucination. The framework integrates static analysis, program graph modeling, hierarchical prompt engineering, and zero-shot inference. Contribution/Results: Experiments demonstrate that the framework significantly outperforms conventional detectors in realistic settings, achieving high accuracy, strong interpretability, and robustness against evolving threats. It establishes a new paradigm for LLM-augmented malware analysis under dynamic threat conditions.
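To make the tiered pipeline concrete, here is a minimal Python sketch of the three-tier reasoning chain, where each tier's output becomes the next tier's input and the first tier's facts can be checked before they propagate. The `llm` callable, the prompt templates, and the `verify` hook are illustrative assumptions, not LAMD's actual implementation.

```python
# Hypothetical sketch of tier-wise reasoning: instructions -> logic -> semantics.
# `llm` stands in for any chat-completion callable (prompt in, text out).

TIER_PROMPTS = [
    # Tier 1: map low-level instructions and API calls to concrete facts.
    "Summarize what each API call and instruction in this code does:\n{code}",
    # Tier 2: lift instruction-level facts to function-level logic.
    "Given these instruction-level facts, describe the functional logic:\n{facts}",
    # Tier 3: infer application-level semantics and a verdict.
    "Given this functional logic, is the behavior malicious? Explain why:\n{facts}",
]

def tier_wise_analysis(llm, code: str, verify=None) -> str:
    """Run the three reasoning tiers sequentially, chaining each output forward."""
    context = code
    for i, template in enumerate(TIER_PROMPTS):
        key = "code" if i == 0 else "facts"
        output = llm(template.format(**{key: context}))
        # Hypothetical first-tier check: compare tier-1 claims against the raw
        # code to catch hallucinated facts before later tiers amplify them.
        if i == 0 and verify is not None and not verify(code, output):
            output = llm(template.format(code=code))  # retry once
        context = output
    return context  # final prediction plus natural-language explanation
```

Chaining the tiers this way is why the first-tier check matters: a hallucinated instruction-level fact would otherwise be treated as ground truth by every later tier.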
📝 Abstract
The rapid growth of mobile applications has escalated Android malware threats. Although numerous detection methods exist, they often struggle with evolving attacks, dataset biases, and limited explainability. Large Language Models (LLMs) offer a promising alternative with their zero-shot inference and reasoning capabilities. However, applying LLMs to Android malware detection presents two key challenges: (1) the extensive support code in Android applications, often spanning thousands of classes, exceeds LLMs' context limits and obscures malicious behavior within benign functionality; (2) the structural complexity and interdependencies of Android applications surpass LLMs' sequence-based reasoning, fragmenting code analysis and hindering malicious intent inference. To address these challenges, we propose LAMD, a practical context-driven framework for LLM-based Android malware detection. LAMD integrates key context extraction to isolate security-critical code regions and construct program structures, then applies tier-wise code reasoning to analyze application behavior progressively, from low-level instructions to high-level semantics, yielding a final prediction and explanation. A well-designed factual consistency verification mechanism is applied at the first tier to mitigate LLM hallucinations. Evaluation in real-world settings demonstrates LAMD's effectiveness over conventional detectors, establishing a feasible basis for LLM-driven malware analysis in dynamic threat landscapes.
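As a rough illustration of the key context extraction step, the sketch below slices an app's call graph down to the methods that can reach a security-sensitive API, which is one common way to isolate security-critical code regions. The `networkx` graph, the API list, and the helper name are assumptions made for illustration; the paper's actual static analysis is more involved.

```python
# Minimal sketch: backward slicing over a caller -> callee call graph,
# keeping only methods with some path to a security-sensitive API.
import networkx as nx

SENSITIVE_APIS = {"sendTextMessage", "getDeviceId", "exec"}  # illustrative list

def extract_key_context(call_graph: nx.DiGraph) -> set[str]:
    """Return methods that can transitively reach a sensitive API."""
    critical = {n for n in call_graph if n in SENSITIVE_APIS}
    keep = set(critical)
    for api in critical:
        # ancestors() collects every caller that can transitively reach `api`
        keep |= nx.ancestors(call_graph, api)
    return keep

# Toy usage: only the path touching getDeviceId survives the slice.
g = nx.DiGraph([("onCreate", "stealInfo"), ("stealInfo", "getDeviceId"),
                ("onCreate", "drawUI")])
print(extract_key_context(g))  # {'onCreate', 'stealInfo', 'getDeviceId'}
```

The payoff of a slice like this is that the code handed to the LLM shrinks from thousands of classes to the handful of methods that actually matter, fitting within context limits while preserving the suspicious behavior.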