Towards Quantum Machine Learning for Malicious Code Analysis

📅 2025-08-26
🤖 AI Summary
To address the limited generalizability and constrained feature representation of classical machine learning in malware classification, this paper systematically investigates quantum machine learning (QML) for malware detection. It evaluates two hybrid quantum-classical models: a Quantum Multilayer Perceptron (QMLP) and a Quantum Convolutional Neural Network (QCNN). Input features, derived from the API-Graph, EMBER, and AZ malware datasets, are encoded into quantum states via angle encoding; data re-uploading (in QMLP) and quantum pooling (in QCNN) are integrated to enable end-to-end differentiable training. Binary classification accuracy ranges from 77% to 96% depending on the dataset, and multiclass accuracy reaches up to 95.7%. QMLP is more accurate on complex multiclass tasks, while QCNN trains faster at the cost of reduced accuracy. The work provides empirical evidence for applying QML to cybersecurity tasks.
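
As a rough illustration of the angle encoding mentioned above, the following sketch simulates it in plain Python. Each feature x_i is used as the rotation angle of an RY gate on its own qubit, so the Pauli-Z expectation on that qubit equals cos(x_i). The function names `angle_encode` and `expval_z` are hypothetical, and the choice of RY as the rotation axis is an assumption; the paper's circuits may use a different axis or feature scaling.

```python
import math


def angle_encode(features):
    """Statevector after applying RY(x_i) to qubit i of |0...0>.

    Assumption: RY rotations, one feature per qubit. The amplitudes are
    real because RY acting on |0> produces no complex phases.
    """
    state = [1.0]
    for x in features:
        qubit = [math.cos(x / 2), math.sin(x / 2)]  # RY(x)|0>
        # Kronecker product; the new qubit becomes the least significant bit.
        state = [a * b for a in state for b in qubit]
    return state


def expval_z(state, qubit):
    """Expectation of Pauli-Z on `qubit` (0 = first encoded feature)."""
    n_qubits = len(state).bit_length() - 1
    shift = n_qubits - 1 - qubit
    total = 0.0
    for index, amp in enumerate(state):
        sign = 1 if ((index >> shift) & 1) == 0 else -1
        total += sign * amp * amp
    return total
```

For the features `[0.3, 1.2, 2.0]` this builds an 8-amplitude state, and `expval_z(state, i)` matches `math.cos(features[i])` up to floating-point error. Because the readout is periodic in each feature, inputs are commonly rescaled to a bounded range such as [0, π] before encoding.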

📝 Abstract
Classical machine learning (CML) has been extensively studied for malware classification. With the emergence of quantum computing, quantum machine learning (QML) presents a paradigm-shifting opportunity to improve malware detection, though its application in this domain remains largely unexplored. In this study, we investigate two hybrid quantum-classical models, a Quantum Multilayer Perceptron (QMLP) and a Quantum Convolutional Neural Network (QCNN), for malware classification. Both models use angle embedding to encode malware features into quantum states. QMLP captures complex patterns through full qubit measurement and data re-uploading, while QCNN achieves faster training via quantum convolution and pooling layers that reduce the number of active qubits. We evaluate both models on five widely used malware datasets (API-Graph, EMBER-Domain, EMBER-Class, AZ-Domain, and AZ-Class) across binary and multiclass classification tasks. Our results show high accuracy for binary classification: 95-96% on API-Graph, 91-92% on AZ-Domain, and 77% on EMBER-Domain. In multiclass settings, accuracy ranges from 91.6% to 95.7% on API-Graph, 41.7% to 93.6% on AZ-Class, and 60.7% to 88.1% on EMBER-Class. Overall, QMLP outperforms QCNN in complex multiclass tasks, while QCNN offers improved training efficiency at the cost of reduced accuracy.
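
To make the idea of data re-uploading more concrete, here is a minimal single-qubit sketch in plain Python. The same feature `x` is encoded again at the start of every layer, followed by trainable rotations. The layer layout (RY(x), then RZ(a), then RY(b)) and the name `reupload_expval_z` are assumptions chosen for illustration, not the ansatz used in the paper.

```python
import cmath
import math


def ry(t):
    """Matrix of a rotation about the Y axis by angle t."""
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[c, -s], [s, c]]


def rz(t):
    """Matrix of a rotation about the Z axis by angle t."""
    return [[cmath.exp(-1j * t / 2), 0], [0, cmath.exp(1j * t / 2)]]


def apply(gate, state):
    """Multiply a 2x2 gate into a single-qubit state [amp0, amp1]."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]


def reupload_expval_z(x, weights):
    """Pauli-Z expectation after len(weights) re-uploading layers.

    `weights` is a list of (a, b) pairs, one pair per layer (hypothetical
    parameterization).
    """
    state = [1.0, 0.0]
    for a, b in weights:
        state = apply(ry(x), state)  # encode the same feature again
        state = apply(rz(a), state)  # trainable rotation about Z
        state = apply(ry(b), state)  # trainable rotation about Y
    return abs(state[0]) ** 2 - abs(state[1]) ** 2
```

When every `a` is zero, the layers commute and the output reduces to cos(L·x + Σb) for L layers, so each additional upload raises the frequency with which the output can vary in `x`. Nonzero `a` values mix the rotation axes and give a richer family of functions.
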
Problem

Research questions and friction points this paper is trying to address.

Investigating quantum machine learning for malware classification
Evaluating hybrid quantum-classical models on malware datasets
Comparing QMLP and QCNN performance in binary and multiclass tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid quantum-classical models for malware classification
Angle embedding encodes features into quantum states
Quantum convolution and pooling reduce active qubits
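
The effect of pooling on circuit width can be sketched with simple bookkeeping. The snippet below assumes that each pooling layer keeps half of the active qubits (rounded up) until one remains, which is a common QCNN layout; the paper's exact pooling schedule is not given here, and `active_qubits_schedule` is a hypothetical helper.

```python
def active_qubits_schedule(n_qubits):
    """Active-qubit count before and after each pooling layer.

    Assumption: every pooling layer keeps ceil(n / 2) of the n active qubits.
    """
    if n_qubits < 1:
        raise ValueError("n_qubits must be at least 1")
    schedule = [n_qubits]
    while schedule[-1] > 1:
        schedule.append((schedule[-1] + 1) // 2)
    return schedule
```

Under this assumption an 8-qubit circuit shrinks as 8, 4, 2, 1, so later layers act on fewer qubits and need fewer parameterized gates, which is consistent with the faster training reported for QCNN.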