🤖 AI Summary
To address critical challenges—including severe hallucination, weak dynamic adaptability, and complex resource scheduling—in existing large language model (LLM)-driven network orchestration for sixth-generation (6G) space-air-ground integrated networks (SAGIN) and semantic communication (SemCom), this paper proposes the Autonomous Reinforcement Coordination (ARC) framework. Methodologically, ARC introduces a two-layer paradigm that couples LLMs with continual reinforcement learning (RL): the LLM layer integrates retrieval-augmented generation (RAG), chain-of-thought (CoT) few-shot reasoning, and a hierarchical action planner (HAP) for high-level semantic orchestration; the RL layer employs replay buffer management and Mixture-of-Experts (MoE)-based contrastive learning to ensure robust low-level decision-making and long-term evolutionary capability. Experimental results demonstrate that ARC significantly outperforms baselines in service latency, resource utilization, and dynamic adaptability, achieving superior orchestration accuracy, efficiency, and continual learning capability.
📝 Abstract
6G networks aim to achieve global coverage, massive connectivity, and ultra-stringent service requirements. Space-Air-Ground Integrated Networks (SAGINs) and Semantic Communication (SemCom) are essential for realizing these goals, yet they introduce considerable complexity in resource orchestration. Drawing inspiration from research in robotics, a viable solution to manage this complexity is the application of Large Language Models (LLMs). Although the use of LLMs in network orchestration has recently gained attention, existing solutions have not sufficiently addressed LLM hallucinations or their adaptation to network dynamics. To address this gap, this paper proposes a framework called Autonomous Reinforcement Coordination (ARC) for a SemCom-enabled SAGIN. In this framework, an LLM-based Retrieval-Augmented Generator (RAG) monitors services, users, and resources and processes the collected data, while a Hierarchical Action Planner (HAP) orchestrates resources. ARC decomposes orchestration into two tiers, utilizing LLMs for high-level planning and Reinforcement Learning (RL) agents for low-level decision-making, in alignment with the Mixture of Experts (MoE) concept. The LLMs utilize Chain-of-Thought (CoT) reasoning for few-shot learning, empowered by contrastive learning, while the RL agents employ replay buffer management for continual learning, thereby achieving efficiency, accuracy, and adaptability. Simulations are provided to demonstrate the effectiveness of ARC, along with a comprehensive discussion on potential future research directions to enhance and upgrade ARC.
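The two-tier decomposition described above can be illustrated with a minimal sketch. This is not the paper's implementation: the sub-goal names, actions, and the stub planner standing in for the LLM-based HAP are all hypothetical, and the toy tabular experts merely illustrate the idea of MoE-style routing (one RL expert per sub-goal) plus a replay buffer for continual learning.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size buffer: old transitions are evicted as new ones arrive,
    supporting continual learning over a shifting network state."""
    def __init__(self, capacity=1000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)


class HighLevelPlanner:
    """Stand-in for the LLM-based Hierarchical Action Planner: maps a
    service request to a sequence of coarse sub-goals (hypothetical names)."""
    def plan(self, request):
        if request == "video_stream":
            return ["allocate_bandwidth", "select_node", "schedule_compute"]
        return ["select_node"]


class ExpertAgent:
    """Toy tabular RL expert for one sub-goal (epsilon-greedy over actions)."""
    def __init__(self, actions, epsilon=0.1):
        self.q = {a: 0.0 for a in actions}
        self.epsilon = epsilon

    def act(self, rng):
        if rng.random() < self.epsilon:
            return rng.choice(list(self.q))      # explore
        return max(self.q, key=self.q.get)       # exploit (ties -> first action)

    def update(self, action, reward, lr=0.5):
        # Simple incremental value update toward the observed reward.
        self.q[action] += lr * (reward - self.q[action])


class ARCOrchestrator:
    """Two-tier loop: the planner emits sub-goals; MoE-style routing picks
    the expert responsible for each sub-goal; every transition is stored in
    the replay buffer for later continual-learning updates."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.planner = HighLevelPlanner()
        self.experts = {
            "allocate_bandwidth": ExpertAgent(["10MHz", "20MHz"]),
            "select_node": ExpertAgent(["satellite", "uav", "ground"]),
            "schedule_compute": ExpertAgent(["edge", "cloud"]),
        }
        self.buffer = ReplayBuffer()

    def handle(self, request, reward_fn):
        decisions = {}
        for goal in self.planner.plan(request):
            expert = self.experts[goal]          # MoE-style routing by sub-goal
            action = expert.act(self.rng)
            reward = reward_fn(goal, action)
            expert.update(action, reward)
            self.buffer.add((goal, action, reward))
            decisions[goal] = action
        return decisions
```

A caller would invoke `handle` per service request with an environment-provided reward function; repeated calls let each expert refine its sub-goal policy while the high-level plan stays interpretable.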