🤖 AI Summary
Deploying large language models (LLMs) on edge devices for cybersecurity question-answering tasks faces three critical challenges: quantization-induced accuracy degradation, high adversarial vulnerability, and excessive inference latency. To address them, this paper proposes an end-to-end edge-aligned defense framework that uniquely integrates Quantized Low-Rank Adaptation (QLoRA) with unsupervised Direct Preference Optimization (DPO), leveraging a custom-built cybersecurity preference dataset to achieve domain-specific alignment. The framework enables efficient fine-tuning and inference optimization on the Jetson Orin platform. Experiments demonstrate that our approach reduces adversarial attack success rates by up to 7.3×, improves QA accuracy by up to 55%, and achieves the lowest inference latency among current state-of-the-art methods. By jointly optimizing efficiency, robustness, and accuracy, this work validates a secure, lightweight, and deployable edge-LLM paradigm.
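The summary's QLoRA component freezes a quantized base model and trains only small low-rank adapters. As a stdlib-only sketch of the underlying LoRA idea (in real QLoRA the frozen weights are stored in 4-bit NF4 and the math runs on tensors; the class and parameter names here are illustrative, not the paper's implementation):

```python
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

class LoRALinear:
    """Frozen base weight W plus a trainable low-rank update (alpha/r) * B @ A."""
    def __init__(self, W, r=2, alpha=4):
        d_out, d_in = len(W), len(W[0])
        self.W = W                  # frozen (4-bit quantized in actual QLoRA)
        self.scale = alpha / r      # standard LoRA scaling factor
        # A starts small and random; B starts at zero, so the initial update is zero
        self.A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        delta = matvec(self.B, matvec(self.A, x))   # low-rank trainable path
        return [b + self.scale * d for b, d in zip(base, delta)]
```

Because B is zero-initialized, the adapted layer reproduces the frozen model exactly at the start of fine-tuning; only A and B would receive gradient updates.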
📝 Abstract
Large Language Models (LLMs) are highly effective for cybersecurity question answering (QA) but are difficult to deploy on edge devices due to their size. Quantization reduces memory and compute requirements but often degrades accuracy and increases vulnerability to adversarial attacks. We present EAGER, an edge-aligned defense framework that integrates parameter-efficient quantization with domain-specific preference alignment to jointly optimize efficiency, robustness, and accuracy. Unlike prior methods that address these aspects separately, EAGER leverages Quantized Low-Rank Adaptation (QLoRA) for low-cost fine-tuning and Direct Preference Optimization (DPO) on a self-constructed cybersecurity preference dataset, eliminating the need for human labels. Experiments show that EAGER reduces adversarial attack success rates by up to 7.3× and improves QA accuracy by up to 55% over state-of-the-art defenses, while achieving the lowest response latency on a Jetson Orin, demonstrating its practicality for edge deployment.
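The DPO step described above optimizes a preference objective over (chosen, rejected) answer pairs without a reward model. As an illustration of the standard per-pair DPO loss, not the paper's code, the objective can be computed from summed token log-probabilities under the policy and a frozen reference model; the function name and the β value (the usual KL-tradeoff strength) are assumptions for this sketch:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (margin_chosen - margin_rejected)).

    Each argument is the summed token log-probability of the chosen or
    rejected response under the policy or the frozen reference model.
    """
    # Implicit reward of each response, measured relative to the reference model
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # Numerically stable -log(sigmoid(z)) = log(1 + exp(-z))
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy and reference agree, the margins cancel and the loss sits at log 2; as the policy assigns relatively more probability to the chosen answer than the reference does, the loss falls toward zero, which is the direction the alignment step pushes.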