AI Summary
Modern malware increasingly employs obfuscation and polymorphism to evade conventional static analysis, rendering such approaches ineffective. To address this challenge, we propose BEACON, a behavior-based malware classification framework leveraging large language models (LLMs) and deep learning. BEACON directly processes raw behavioral reports generated by sandboxing environments; it employs an LLM to produce context-aware semantic embeddings of behavioral events and applies a 1D convolutional neural network (CNN) to capture local structural patterns in behavioral sequences, thereby enabling robust identification of sophisticated variants. Evaluated on the Avast-CTU Public CAPE dataset, BEACON significantly outperforms existing behavioral detection methods, achieving both high accuracy and strong generalization across multi-class malware classification tasks. These results empirically validate the efficacy of semantic-driven behavioral representation learning for malware analysis.
Abstract
Malware is becoming increasingly complex and widespread, making it essential to develop more effective and timely detection methods. Traditional static analysis often fails to defend against modern threats that employ code obfuscation, polymorphism, and other evasion techniques. In contrast, behavioral malware detection, which monitors runtime activities, provides a more reliable and context-aware solution. In this work, we propose BEACON, a novel deep learning framework that leverages large language models (LLMs) to generate dense, contextual embeddings from raw sandbox-generated behavior reports. These embeddings capture semantic and structural patterns of each sample and are processed by a one-dimensional convolutional neural network (1D CNN) for multi-class malware classification. Evaluated on the Avast-CTU Public CAPE Dataset, our framework consistently outperforms existing methods, highlighting the effectiveness of LLM-based behavioral embeddings and the overall design of BEACON for robust malware classification.
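The pipeline described above (embed each behavioral event, convolve along the event sequence, pool, classify) can be illustrated with a small sketch. This is not BEACON's implementation: the abstract does not specify the LLM, the embedding dimension, the filter sizes, or the number of classes, so every name and value below is a hypothetical placeholder. In particular, `toy_embed` is a hash-based stand-in for an LLM embedding, the weights are random rather than trained, and the event strings are invented. The sketch uses only the Python standard library so that the data flow is visible step by step.

```python
import hashlib
import math
import random


def toy_embed(event, dim=8):
    """Hypothetical stand-in for an LLM embedding (NOT a real model).

    Maps an event string to a deterministic vector in [-1, 1]^dim using
    SHA-256 bytes, so dim must be at most 32.
    """
    digest = hashlib.sha256(event.encode("utf-8")).digest()
    return [digest[i] / 127.5 - 1.0 for i in range(dim)]


def conv1d_relu(seq, kernels, biases):
    """Valid 1D convolution along the time axis, followed by ReLU.

    seq is T x D, kernels is F x K x D; the result is (T - K + 1) x F.
    """
    k = len(kernels[0])
    if len(seq) < k:
        raise ValueError("sequence is shorter than the kernel size")
    out = []
    for t in range(len(seq) - k + 1):
        row = []
        for kern, b in zip(kernels, biases):
            s = b
            for j in range(k):
                s += sum(w * x for w, x in zip(kern[j], seq[t + j]))
            row.append(max(0.0, s))
        out.append(row)
    return out


def global_max_pool(feature_map):
    """Keep the strongest response of each filter over all positions."""
    return [max(col) for col in zip(*feature_map)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(dim=8, filters=4, kernel_size=3, classes=3, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def w():
        return rng.uniform(-0.5, 0.5)

    return {
        "dim": dim,
        "kernels": [[[w() for _ in range(dim)] for _ in range(kernel_size)]
                    for _ in range(filters)],
        "conv_bias": [0.0] * filters,
        "dense": [[w() for _ in range(filters)] for _ in range(classes)],
        "dense_bias": [0.0] * classes,
    }


def classify(events, params):
    """Embed events, apply conv + pooling, and return class probabilities."""
    seq = [toy_embed(e, params["dim"]) for e in events]
    fmap = conv1d_relu(seq, params["kernels"], params["conv_bias"])
    pooled = global_max_pool(fmap)
    logits = [b + sum(w * x for w, x in zip(row, pooled))
              for row, b in zip(params["dense"], params["dense_bias"])]
    return softmax(logits)


if __name__ == "__main__":
    # Invented event strings, standing in for lines of a sandbox report.
    report = [
        "CreateFile C:\\tmp\\a.exe",
        "RegSetValue HKCU\\Run",
        "connect 10.0.0.1:443",
        "WriteProcessMemory pid=4",
    ]
    print(classify(report, init_params()))
```

Each convolution filter spans a window of three consecutive events, which is one way to read "local structural patterns in behavioral sequences"; global max pooling then makes the prediction independent of where in the report a pattern occurs. A practical version would replace `toy_embed` with real model embeddings and learn the weights from labeled reports.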