Advancing LLM-Based Security Automation with Customized Group Relative Policy Optimization for Zero-Touch Networks

📅 2025-12-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
The distributed, open, and heterogeneous nature of 6G Zero-Touch Networks (ZTNs) significantly expands the attack surface, so security automation must operate in dynamic adversarial environments. Existing approaches, however, automate only part of the policy lifecycle and adapt poorly to changing environments and threats. Method: This paper proposes SecLoop, an LLM-driven, fully automated closed-loop security system, and introduces SA-GRPO, a novel security-aware group relative policy optimization algorithm. SecLoop integrates large language models, MITRE ATT&CK knowledge, and multi-scenario parallel adversarial simulation to enable end-to-end policy generation, orchestration, response, and feedback; SA-GRPO contrasts feedback within a group of parallel executions to refine policies iteratively. Contribution/Results: Evaluated on five real-world benchmarks covering 11 MITRE ATT&CK processes and over 20 attack types, SecLoop achieves a 37% improvement in policy accuracy and sub-second response latency. The authors plan to release the platform to the community to advance automated security for 6G.

📝 Abstract
Zero-Touch Networks (ZTNs) represent a transformative paradigm toward fully automated and intelligent network management, providing the scalability and adaptability required for the complexity of sixth-generation (6G) networks. However, the distributed architecture, high openness, and deep heterogeneity of 6G networks expand the attack surface and pose unprecedented security challenges. To address this, security automation aims to enable intelligent security management across dynamic and complex environments, serving as a key capability for securing 6G ZTNs. Despite its promise, implementing security automation in 6G ZTNs presents two primary challenges: 1) automating the lifecycle from security strategy generation to validation and update under real-world, parallel, and adversarial conditions, and 2) adapting security strategies to evolving threats and dynamic environments. This motivates us to propose SecLoop and SA-GRPO. SecLoop constitutes the first fully automated framework that integrates large language models (LLMs) across the entire lifecycle of security strategy generation, orchestration, response, and feedback, enabling intelligent and adaptive defenses in dynamic network environments, thus tackling the first challenge. Furthermore, we propose SA-GRPO, a novel security-aware group relative policy optimization algorithm that iteratively refines security strategies by contrasting group feedback collected from parallel SecLoop executions, thereby addressing the second challenge. Extensive real-world experiments on five benchmarks, including 11 MITRE ATT&CK processes and over 20 types of attacks, demonstrate the superiority of the proposed SecLoop and SA-GRPO. We will release our platform to the community, facilitating the advancement of security automation towards next generation communications.
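
The lifecycle named in the abstract (generation, orchestration, response, feedback) can be pictured as a simple loop. The sketch below is only an illustration under stated assumptions: the function names, the toy strategies, and the binary feedback score are all hypothetical and are not taken from the SecLoop platform, where an LLM would produce the strategy.

```python
from typing import Callable, List, Tuple


def run_closed_loop(
    generate: Callable[[str, List[float]], str],
    orchestrate: Callable[[str], List[str]],
    respond: Callable[[List[str], str], bool],
    threat: str,
    rounds: int = 3,
) -> Tuple[str, List[float]]:
    """Run generation -> orchestration -> response -> feedback until blocked."""
    feedback: List[float] = []
    strategy = ""
    for _ in range(rounds):
        strategy = generate(threat, feedback)     # strategy generation
        actions = orchestrate(strategy)           # map strategy to actions
        blocked = respond(actions, threat)        # apply actions to the threat
        feedback.append(1.0 if blocked else 0.0)  # record outcome as feedback
        if blocked:
            break
    return strategy, feedback


# Hypothetical stand-ins for the LLM and the network environment.
def toy_generate(threat: str, feedback: List[float]) -> str:
    # Escalate from monitoring to blocking once any round has been recorded.
    return "block" if feedback else "monitor"


def toy_orchestrate(strategy: str) -> List[str]:
    return [f"{strategy}:edge-firewall"]


def toy_respond(actions: List[str], threat: str) -> bool:
    return any(a.startswith("block") for a in actions)


strategy, feedback = run_closed_loop(
    toy_generate, toy_orchestrate, toy_respond, threat="port-scan"
)
```

In this toy run the first round only monitors and fails, and the recorded feedback drives the second round to a blocking strategy.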
Problem

Research questions and friction points this paper is trying to address.

Automates security lifecycle from strategy generation to validation in adversarial 6G networks
Adapts security strategies to evolving threats and dynamic network environments
Integrates LLMs for intelligent, adaptive defenses in zero-touch network management
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based automated security lifecycle framework SecLoop
Group relative policy optimization algorithm SA-GRPO
Contrastive group feedback for iterative strategy refinement
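
To make the idea of group-relative feedback concrete, the sketch below shows the reward normalization commonly used in generic group relative policy optimization: each sample's reward is compared with the mean and spread of its own group. This is an assumption about the general technique, not the SA-GRPO algorithm itself; the security-aware reward and the policy update are not described on this page, and the reward values are made up for illustration.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Score each reward relative to its group's mean and standard deviation."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps avoids division by zero when all rewards in the group are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical feedback scores from four parallel runs on the same task.
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(rewards)

# Index of the strategy that did best relative to its group.
best = max(range(len(rewards)), key=lambda i: advantages[i])
```

Strategies with a positive advantage would be reinforced and those with a negative one discouraged; a group in which every run scores the same yields zero advantage and therefore no learning signal.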
Xinye Cao
National Engineering Research Center for Mobile Network Technologies, Beijing University of Posts and Telecommunications, China
Yihan Lin
Assistant Professor, Xiamen University
Brain inspired Vision, Deep learning, Neuromorphic engineering, Complex networks
Guoshun Nan
Professor of Beijing University of Posts and Telecommunications
Multimodal Learning, Video LLM, 6G Security, Semantic Communications
Qinchuan Zhou
National Engineering Research Center for Mobile Network Technologies, Beijing University of Posts and Telecommunications, China
Yuhang Luo
National Engineering Research Center for Mobile Network Technologies, Beijing University of Posts and Telecommunications, China
Yurui Gao
National Engineering Research Center for Mobile Network Technologies, Beijing University of Posts and Telecommunications, China
Zeliang Zhang
PhD Candidate @ University of Rochester; BEng @ HUST
trustworthy and efficient AI
Haolang Lu
National Engineering Research Center for Mobile Network Technologies, Beijing University of Posts and Telecommunications, China
Qimei Cui
Professor, School of Information and Communication Engineering, Beijing University of Posts and
B5G/6G wireless communications, mobile computing and IoT
Yanzhao Hou
National Engineering Research Center for Mobile Network Technologies, Beijing University of Posts and Telecommunications, China
Xiaofeng Tao
Beijing University of Posts and Telecommunications
wireless communication
Tony Q. S. Quek
Singapore University of Technology and Design, Singapore 487372, and also with the Department of Electronic Engineering, Kyung Hee University, Yongin 17104, South Korea