🤖 AI Summary
This paper identifies “cyber threat inflation”, a phenomenon in which LLM-driven autonomous cyberattack agents drastically reduce attack costs while amplifying attack scale and sophistication. Method: We propose the first four-layer capability model (Reconnaissance–Memory–Reasoning–Execution) to systematically analyze attack paradigms, efficacy disparities, and defensive bottlenecks across static, mobile, and infrastructure-free networks. Integrating multi-agent architecture, adversarial modeling, and situational awareness, we empirically evaluate six high-risk LLM-augmented attack pathways. Contribution/Results: We formally define “cyber threat inflation,” characterize cross-paradigm threat asymmetry, and reveal that existing defenses cover fewer than 40% of autonomous decision-making attacks. Furthermore, we design a three-tiered, legacy-system-compatible defense framework (progressive, adaptive, and operationally grounded) that provides actionable theoretical foundations and practical mitigation strategies for LLM-era cyber threats.
📝 Abstract
With the continuous evolution of Large Language Models (LLMs), LLM-based agents have advanced beyond passive chatbots to become autonomous cyber entities capable of performing complex tasks, including web browsing, generating malicious code and deceptive content, and decision-making. By significantly reducing the time, expertise, and resources required to mount an attack, AI-assisted cyberattacks orchestrated by LLM-based agents have led to a phenomenon termed Cyber Threat Inflation, characterized by a significant reduction in attack costs and a tremendous increase in attack scale. To provide actionable defensive insights, this survey focuses on the potential cyber threats posed by LLM-based agents across diverse network systems. First, we present the capabilities of LLM-based cyberattack agents, which include executing autonomous attack strategies (comprising scouting, memory, reasoning, and action) and collaborating with other agents or human operators. Building on these capabilities, we examine common cyberattacks initiated by LLM-based agents and compare their effectiveness across different types of networks, including static, mobile, and infrastructure-free paradigms. Moreover, we analyze the threat bottlenecks of LLM-based agents across different network infrastructures and review the corresponding defense methods. Due to operational imbalances, existing defense methods are inadequate against autonomous cyberattacks. Finally, we outline future research directions and potential defensive strategies for legacy network systems.