Enhancing Cloud Network Resilience via a Robust LLM-Empowered Multi-Agent Reinforcement Learning Framework

📅 2026-01-12
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited robustness, adaptability, and interpretability of existing reinforcement learning–based cloud network defense methods under dynamic environments and evolving attack strategies, particularly due to the absence of human-in-the-loop mechanisms. To overcome these limitations, the authors propose CyberOps-Bots, a novel autonomous defense framework that integrates large language models (LLMs) with hierarchical multi-agent reinforcement learning. In this architecture, a high-level LLM agent performs global tactical planning through ReAct-style reasoning grounded in the IPDRR cyber defense model, while low-level RL agents execute localized defensive actions. The design incorporates a MITRE ATT&CK–inspired two-tier structure and a heterogeneous disentangled pretraining mechanism. Crucially, the framework supports human-in-the-loop intervention and adapts to new scenarios without retraining. Experiments on real-world cloud datasets demonstrate a 68.5% improvement in network availability and a 34.7% performance gain during cross-scenario transfer.

📝 Abstract
While virtualization and resource pooling empower cloud networks with structural flexibility and elastic scalability, they inevitably expand the attack surface and challenge cyber resilience. Reinforcement Learning (RL)-based defense strategies have been developed to optimize resource deployment and isolation policies under adversarial conditions, aiming to enhance system resilience by maintaining and restoring network availability. However, existing approaches lack robustness, as they require retraining to adapt to dynamic changes in network structure, node scale, attack strategies, and attack intensity. Furthermore, the lack of Human-in-the-Loop (HITL) support limits interpretability and flexibility. To address these limitations, we propose CyberOps-Bots, a hierarchical multi-agent reinforcement learning framework empowered by Large Language Models (LLMs). Inspired by MITRE ATT&CK's Tactics-Techniques model, CyberOps-Bots features a two-layer architecture: (1) an upper-level LLM agent with four modules--ReAct planning, IPDRR-based perception, long- and short-term memory, and action/tool integration--performs global awareness, human intent recognition, and tactical planning; (2) lower-level RL agents, developed via heterogeneous separated pre-training, execute atomic defense actions within localized network regions. This synergy preserves the LLM's adaptability and interpretability while ensuring reliable RL execution. Experiments on real cloud datasets show that, compared to state-of-the-art algorithms, CyberOps-Bots maintains network availability 68.5% higher and achieves a 34.7% jumpstart performance gain when shifting scenarios without retraining. To our knowledge, this is the first study to establish a robust LLM-RL framework with HITL support for cloud defense. We will release our framework to the community, facilitating the advancement of robust and autonomous defense in cloud networks.
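The two-layer architecture described above can be sketched as a simple dispatch loop: an upper-level planner (standing in for the LLM agent's ReAct-style tactical planning) selects a tactic, and lower-level executors (standing in for the pre-trained RL agents) translate that tactic into atomic defense actions within their local region. This is a minimal illustrative sketch; all names and the tactic/action vocabulary are assumptions, not taken from the paper.

```python
# Hypothetical sketch of a hierarchical LLM-planner / RL-executor loop.
# The tactic names, region labels, and function names are illustrative
# assumptions; they are not the paper's actual interface.

from dataclasses import dataclass


@dataclass
class Observation:
    """A simplified local observation of one network region."""
    region: str
    under_attack: bool


def upper_level_plan(obs: Observation) -> str:
    """Stand-in for the LLM agent's global tactical planning.

    In the real framework this would be ReAct-style reasoning grounded
    in the IPDRR model; here it is a trivial rule for illustration.
    """
    return "isolate" if obs.under_attack else "monitor"


# Each tactic dispatches to a local executor, playing the role of a
# lower-level RL agent that emits an atomic defense action.
LOWER_LEVEL_POLICIES = {
    "isolate": lambda obs: f"quarantine nodes in {obs.region}",
    "monitor": lambda obs: f"collect telemetry from {obs.region}",
}


def defend(obs: Observation) -> str:
    """One step of the hierarchical loop: plan a tactic, then execute."""
    tactic = upper_level_plan(obs)
    return LOWER_LEVEL_POLICIES[tactic](obs)
```

Because the tactic layer is decoupled from the executors, swapping the planner (e.g. for a human-in-the-loop override) leaves the lower-level policies untouched, which mirrors the adaptability the paper claims for its design.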
Problem

Research questions and friction points this paper is trying to address.

cloud network resilience
robustness
dynamic adversarial environments
Human-in-the-Loop
reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-empowered multi-agent reinforcement learning
cyber resilience
Human-in-the-Loop (HITL)
hierarchical defense framework
cloud network security
Yixiao Peng
State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China, and also with the Henan Key Laboratory of Information Security, Zhengzhou, China
Hao Hu
Anhui University, China
statistical physics · soft matter · percolation · phase transitions
Feiyang Li
State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China, and also with the Henan Key Laboratory of Information Security, Zhengzhou, China
Xinye Cao
National Engineering Research Center for Mobile Network Technologies, Beijing University of Posts and Telecommunications, China
Yingchang Jiang
State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China, and also with the Henan Key Laboratory of Information Security, Zhengzhou, China
Jipeng Tang
State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, China, and also with the Henan Key Laboratory of Information Security, Zhengzhou, China
Guoshun Nan
Professor, Beijing University of Posts and Telecommunications
Multimodal Learning · Video LLM · 6G Security · Semantic Communications
Yuling Liu
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China