AQUA-LLM: Evaluating Accuracy, Quantization, and Adversarial Robustness Trade-offs in LLMs for Cybersecurity Question Answering

📅 2025-09-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deploying large language models (LLMs) for cybersecurity question-answering on resource-constrained edge devices faces three intertwined challenges: high computational overhead, accuracy degradation under quantization, and weakened adversarial robustness. Method: We propose AQUA-LLM, the first framework to systematically investigate the synergistic effects of quantization and task-specific fine-tuning. We evaluate four configurations—baseline, quantization-only, fine-tuning-only, and quantization-plus-fine-tuning—on cybersecurity QA tasks, jointly measuring accuracy, inference efficiency, and adversarial robustness. Results: Quantization alone improves efficiency but substantially harms both accuracy and robustness; in contrast, lightweight fine-tuning combined with quantization not only recovers but often exceeds the original model’s accuracy while significantly enhancing resistance to adversarial attacks. This synergy achieves an optimal trade-off among all three objectives. Our work establishes a reproducible optimization paradigm and empirical benchmark for deploying secure, efficient LLMs on edge devices.
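To make the four-way comparison concrete, below is a minimal sketch of the kind of evaluation grid the summary describes: one pass per configuration, jointly recording accuracy, adversarial accuracy, and latency. The `load_model` and `answer` callables and the QA item format are hypothetical stand-ins; the paper does not publish this interface or its exact metrics pipeline.

```python
import time
from dataclasses import dataclass

# Configuration names mirror the paper's four settings; everything else is illustrative.
CONFIGS = ["base", "quantized", "fine-tuned", "fine-tuned+quantized"]

@dataclass
class Result:
    config: str
    accuracy: float        # fraction of clean QA items answered correctly
    adv_accuracy: float    # accuracy on adversarially perturbed questions
    latency_s: float       # mean seconds per answered question

def evaluate(config: str, qa_items, adv_items, load_model, answer) -> Result:
    """Jointly measure accuracy, adversarial robustness, and inference latency
    for one model configuration (base / quantized / fine-tuned / both)."""
    model = load_model(config)

    def score(items):
        start, correct = time.perf_counter(), 0
        for question, gold in items:
            if answer(model, question).strip().lower() == gold.strip().lower():
                correct += 1
        elapsed = time.perf_counter() - start
        return correct / len(items), elapsed / len(items)

    acc, latency = score(qa_items)
    adv_acc, _ = score(adv_items)
    return Result(config, acc, adv_acc, latency)

# results = [evaluate(c, qa_items, adv_items, load_model, answer) for c in CONFIGS]
```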

📝 Abstract
Large Language Models (LLMs) have recently demonstrated strong potential for cybersecurity question answering (QA), supporting decision-making in real-time threat detection and response workflows. However, their substantial computational demands pose significant challenges for deployment on resource-constrained edge devices. Quantization, a widely adopted model compression technique, can alleviate these constraints. Nevertheless, quantization may degrade model accuracy and increase susceptibility to adversarial attacks. Fine-tuning offers a potential means to mitigate these limitations, but its effectiveness when combined with quantization remains insufficiently explored. Hence, it is essential to understand the trade-offs among accuracy, efficiency, and robustness. We propose AQUA-LLM, an evaluation framework designed to benchmark several state-of-the-art small LLMs under four distinct configurations: base, quantized-only, fine-tuned, and fine-tuned combined with quantization, specifically for cybersecurity QA. Our results demonstrate that quantization alone yields the lowest accuracy and robustness despite improving efficiency. In contrast, combining quantization with fine-tuning enhances both LLM robustness and predictive performance, achieving an optimal balance of accuracy, robustness, and efficiency. These findings highlight the critical need for quantization-aware, robustness-preserving fine-tuning methodologies to enable the robust and efficient deployment of LLMs for cybersecurity QA.
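The abstract evaluates "fine-tuned combined with quantization" but does not name a specific toolchain, quantization scheme, or model family. Purely as an illustration, the sketch below assumes a QLoRA-style recipe: 4-bit NF4 weight quantization via bitsandbytes plus LoRA adapters via peft on a small open-weight model. The model id, rank, and target modules are assumptions, not values from the paper.

```python
# Illustrative only: a QLoRA-style "quantization + lightweight fine-tuning" recipe.
# The paper's abstract does not specify its quantization method, fine-tuning method,
# or models; everything below (model id, rank, target modules) is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "Qwen/Qwen2.5-1.5B-Instruct"  # hypothetical "small LLM"; swap in the models under study

bnb_config = BitsAndBytesConfig(          # 4-bit weight quantization
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

model = prepare_model_for_kbit_training(model)  # make the quantized base trainable
lora = LoraConfig(                              # lightweight task-specific adapters
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
# ...then fine-tune only the adapters on a cybersecurity QA dataset with a standard Trainer.
```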
Problem

Research questions and friction points this paper is trying to address.

Evaluating accuracy, quantization, and adversarial robustness trade-offs in LLMs
Assessing cybersecurity question answering performance under resource constraints
Exploring fine-tuning combined with quantization for optimal LLM deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combining quantization with fine-tuning for LLMs
Evaluating accuracy, efficiency, and robustness trade-offs (a simple robustness check along these lines is sketched after this list)
Quantization-aware robustness-preserving fine-tuning methodology
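The abstract does not describe which adversarial attacks are applied. As a stand-in for whatever attack suite the paper uses, the sketch below estimates robustness as the accuracy drop when questions receive simple character-level perturbations; the `answer` callable and the perturbation scheme are assumptions for illustration only.

```python
import random

def perturb(question: str, rate: float = 0.05, seed: int = 0) -> str:
    """Adjacent-character swaps as a simple stand-in for an adversarial attack."""
    rng = random.Random(seed)
    chars = list(question)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]  # swap adjacent characters
    return "".join(chars)

def robustness_gap(answer, qa_items) -> float:
    """Accuracy on clean questions minus accuracy on perturbed questions."""
    def acc(items):
        return sum(answer(q).strip().lower() == a.strip().lower() for q, a in items) / len(items)
    clean = acc(qa_items)
    perturbed = acc([(perturb(q), a) for q, a in qa_items])
    return clean - perturbed
```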
Onat Gungor
UC San Diego
Machine Learning · Security · Internet of Things

Roshan Sood
Department of Computer Science and Engineering, University of California, San Diego

Harold Wang
Department of Computer Science and Engineering, University of California, San Diego

Tajana Rosing
Distinguished Professor, UCSD
computer architecture · cyber-physical systems · system energy efficiency