🤖 AI Summary
Modular physical-layer design in 6G leads to suboptimal global performance, and existing LLM-based approaches cannot jointly reason over communication states and user intent. To address both problems, this paper proposes the first multimodal semantic alignment decision-making model for intelligent communications. It establishes a cross-modal semantic alignment mechanism between channel state information (CSI) and natural language instructions, and designs a two-stage reinforcement learning framework that combines multimodal deep learning, behavioral cloning, and multi-objective RL to generate link-level policies end to end. The model supports environment adaptation and personalized policy customization. Experimental results show significant gains over conventional algorithms under complex, dynamic channel conditions, simultaneously achieving low bit error rate, high throughput, and low computational complexity.
📝 Abstract
The emergence of sixth-generation (6G) networks heralds an intelligent communication ecosystem driven by AI-native air interfaces. However, current physical-layer designs, which typically follow modular and isolated optimization paradigms, fail to achieve global end-to-end optimality because they neglect inter-module dependencies. Although large language models (LLMs) have recently been applied to communication tasks such as beam prediction and resource allocation, existing studies remain limited to single-task or single-modality scenarios and cannot jointly reason over communication states and user intents to adapt strategies to individual users. To address these limitations, this paper proposes a novel multimodal communication decision-making model based on reinforcement learning. The model semantically aligns channel state information (CSI) with textual user instructions, so that it understands both the physical-layer conditions and the communication intent. It then generates physically realizable, user-customized link construction strategies that adapt dynamically to changing environments and user preferences. Training follows a two-stage reinforcement learning framework: the first stage expands the experience pool via heuristic exploration and uses behavioral cloning to obtain a near-optimal initialization, while the second stage fine-tunes the model through multi-objective reinforcement learning over bit error rate, throughput, and complexity. Experimental results demonstrate that the proposed model significantly outperforms conventional planning-based algorithms under challenging channel conditions, achieving robust, efficient, and personalized 6G link construction.
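To make the multi-objective stage more concrete, the following minimal Python sketch shows one common way to fold bit error rate, throughput, and complexity into a single scalar reward with preference weights. The abstract does not specify the reward, so everything here is an assumption for illustration: the function name `link_reward`, the weighted-sum form, the log-scaled BER score, and the normalization constants are hypothetical and are not taken from the paper.

```python
import math


def link_reward(ber, throughput_bps, complexity_flops, weights,
                max_throughput_bps=1e9, max_complexity_flops=1e9):
    """Weighted-sum reward over three objectives (hypothetical form).

    weights is a (w_ber, w_throughput, w_complexity) tuple that stands in
    for a user preference; the normalization constants are assumptions.
    """
    w_ber, w_tp, w_cx = weights
    # Log-scale BER so that 1e-6 (or lower) scores 1 and a BER of 1 scores 0.
    ber_score = min(1.0, -math.log10(max(ber, 1e-6)) / 6.0)
    # Higher throughput is better, capped at the assumed maximum.
    tp_score = min(1.0, throughput_bps / max_throughput_bps)
    # Lower complexity is better, so invert the normalized cost.
    cx_score = 1.0 - min(1.0, complexity_flops / max_complexity_flops)
    total = w_ber + w_tp + w_cx
    return (w_ber * ber_score + w_tp * tp_score + w_cx * cx_score) / total


# Two candidate link configurations (illustrative numbers only).
robust = dict(ber=1e-5, throughput_bps=2e8, complexity_flops=5e8)
fast = dict(ber=1e-2, throughput_bps=8e8, complexity_flops=5e8)

reliability_first = (0.8, 0.1, 0.1)
throughput_first = (0.1, 0.8, 0.1)

print(link_reward(**robust, weights=reliability_first) >
      link_reward(**fast, weights=reliability_first))
print(link_reward(**fast, weights=throughput_first) >
      link_reward(**robust, weights=throughput_first))
```

Under these assumed weights, a reliability-oriented preference ranks the low-BER configuration higher, while a throughput-oriented preference ranks the faster one higher. This is the kind of preference-dependent trade-off that the abstract attributes to user-customized strategies, although the paper's actual reward design may differ.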