Evaluating Retrieval-Augmented Generation for Explainable Malware Analysis

📅 2026-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether retrieval-augmented generation (RAG) genuinely enhances the performance of large language models (LLMs) in interpretable malware analysis, particularly when structured security data—such as VirusTotal reports—is already available. Through systematic comparisons of multiple LLMs with and without RAG, complemented by both qualitative and quantitative evaluations, the research reveals that when high-quality structured evidence is present, RAG often introduces distracting or weakly relevant context, leading to narrative noise and overly generalized explanations that degrade both accuracy and clarity. These findings challenge the prevailing assumption that RAG is universally beneficial and instead suggest that malware interpretation is fundamentally a signal extraction task rather than a knowledge retrieval problem, urging caution in deploying RAG for security-critical applications.

📝 Abstract

Large Language Models (LLMs) are increasingly being used as security engineering tools to summarize and explain malware behavior to analysts. A common assumption is that Retrieval-Augmented Generation (RAG) improves explanation quality by injecting external security knowledge. In this work, we empirically evaluate this assumption for malware explanation using VirusTotal reports as structured input. Across multiple LLMs, we find that RAG frequently degrades explanation quality by introducing distracting or weakly related context and adding narrative noise or generic write-ups. Our results highlight a practical risk in security-critical malware-explanation pipelines: RAG can be counterproductive when structured security evidence is already sufficient. We argue that malware explanation is primarily a signal-extraction task, not a knowledge-retrieval problem, and outline design recommendations for secure development workflows.
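
To make the comparison concrete, the two conditions described in the abstract (structured report only versus structured report plus retrieved context) might be set up roughly as follows. This is a minimal sketch and not the authors' pipeline: the function names `extract_signals` and `build_prompt`, the prompt wording, the flattened report layout, and all sample values are assumptions made for illustration. The field names are loosely modeled on VirusTotal-style reports but are simplified here.

```python
from typing import Any, Optional


def extract_signals(report: dict[str, Any]) -> list[str]:
    """Pull a few high-signal facts from a simplified, hypothetical report dict."""
    stats = report.get("last_analysis_stats", {})
    total = sum(stats.values())
    signals = [f"{stats.get('malicious', 0)}/{total} engines flagged the file as malicious"]
    label = report.get("threat_label")  # assumed flattened field, for illustration
    if label:
        signals.append(f"suggested threat label: {label}")
    for tag in report.get("tags", []):
        signals.append(f"tag: {tag}")
    return signals


def build_prompt(report: dict[str, Any], retrieved: Optional[list[str]] = None) -> str:
    """Build the baseline prompt, or the RAG variant when passages are supplied."""
    lines = [
        "Explain this sample's behavior using only the evidence below.",
        "",
        "Evidence:",
    ]
    lines += [f"- {s}" for s in extract_signals(report)]
    if retrieved:
        # RAG condition: the same evidence, followed by external passages.
        lines += ["", "Retrieved context:"]
        lines += [f"- {p}" for p in retrieved]
    return "\n".join(lines)


# Made-up sample values, not data from the paper.
report = {
    "last_analysis_stats": {"malicious": 52, "undetected": 18},
    "threat_label": "trojan.example",
    "tags": ["peexe"],
}
baseline_prompt = build_prompt(report)
rag_prompt = build_prompt(report, ["Generic write-up about trojan families."])
```

Because the RAG prompt is the baseline prompt plus appended passages, any difference in explanation quality between the two conditions can be attributed to the retrieved context alone, which is the effect the abstract reports on.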

Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation

Malware Analysis

Large Language Models

Explainability

Security Engineering

Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation

Malware Analysis

Explainable AI