DWM-RO: Decentralized World Models with Reasoning Offloading for SWIPT-enabled Satellite-Terrestrial HetNets

📅 2025-11-08
🤖 AI Summary
Problem: In SWIPT-enabled satellite-terrestrial heterogeneous networks, time-varying channels and multi-tier interference make distributed optimization of beamforming and power splitting difficult. Method: This paper proposes the Decentralized World Model with Reasoning Offloading (DWM-RO) framework, which combines per-agent world models with selective offloading of coordination to the edge. Each agent learns to predict environment dynamics and trains its policy on imagined rollouts. An uncertainty-driven offloading gate decides when to request edge coordination, and a lightweight latent decorrelation module at the edge steers agents toward orthogonal policies at low overhead. Results: Experiments show that DWM-RO converges 5× faster than state-of-the-art MARL approaches, improves spectral efficiency by 34.7%, and reduces the constraint violation rate by 40%. In dense 10-user scenarios, its violation rate stays below 20% while baselines exceed 70%.

📝 Abstract
Wireless networks are undergoing a paradigm shift toward massive connectivity with energy-efficient operation, driving the integration of satellite-terrestrial architectures with simultaneous wireless information and power transfer (SWIPT). Optimizing transmit beamforming and power splitting in such systems faces formidable challenges, such as time-varying channels and multi-tier interference. These create a complex decision landscape in which conventional model-free multi-agent reinforcement learning (MARL) suffers from sample inefficiency, because some state transitions are rarely encountered, and from poor coordination, because decentralized agents act independently. This paper proposes the Decentralized World Model with Reasoning Offloading (DWM-RO) framework to address these limitations. Specifically, each agent employs a world model to learn compact predictive representations of environment dynamics, enabling imagination-based policy training that dramatically reduces the number of required environment interactions. An uncertainty-aware offloading gate monitors local interference levels and model reconstruction errors to trigger selective edge coordination. When activated, a lightweight latent decorrelation mechanism at the edge refines agents' strategic representations, guiding them toward orthogonal actions that minimize resource conflicts. Extensive simulations demonstrate that DWM-RO converges 5 times faster than state-of-the-art baselines while achieving 34.7% higher spectral efficiency and reducing constraint violations by 40%. In dense network scenarios with 10 users, DWM-RO maintains violation rates below 20% while baselines exceed 70%, validating superior robustness.
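The offloading gate described in the abstract can be pictured with a minimal sketch. The abstract only states that the gate watches local interference and world-model reconstruction error, so everything else here is an assumption made for illustration: the class name `OffloadGate`, the two thresholds, the exponential smoothing of the error, and the rule that either signal alone triggers offloading.

```python
from dataclasses import dataclass, field


@dataclass
class OffloadGate:
    """Hypothetical uncertainty-aware gate (not the paper's exact rule)."""

    interference_threshold: float = 1.0  # assumed units: normalized interference power
    recon_error_threshold: float = 0.5   # assumed bound on smoothed reconstruction error
    smoothing: float = 0.9               # assumed EMA factor for the error signal
    _ema_error: float = field(default=0.0, init=False)

    def update(self, interference: float, recon_error: float) -> bool:
        """Return True when the agent should request edge coordination."""
        # Smooth the reconstruction error so one noisy step does not trigger offloading.
        self._ema_error = (
            self.smoothing * self._ema_error + (1.0 - self.smoothing) * recon_error
        )
        high_interference = interference > self.interference_threshold
        model_uncertain = self._ema_error > self.recon_error_threshold
        return high_interference or model_uncertain


if __name__ == "__main__":
    gate = OffloadGate()
    for step, (interference, error) in enumerate([(0.2, 0.1), (1.5, 0.1), (0.3, 0.2)]):
        print(step, gate.update(interference, error))
```

Under these assumptions a spike in interference triggers coordination immediately, while model error must persist for several steps before it does, which keeps offloading selective.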
Problem

Research questions and friction points this paper is trying to address.

Optimizing transmit beamforming and power splitting in satellite-terrestrial networks
Addressing sample inefficiency in multi-agent reinforcement learning for wireless systems
Reducing resource conflicts and interference in decentralized network coordination
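To make the power-splitting trade-off concrete, the sketch below uses the common textbook SWIPT receiver model with a linear energy harvester. It is an assumption for illustration, not necessarily the system model used in the paper: a fraction `rho` of the received power goes to information decoding and the remainder to energy harvesting. The function name, the noise powers, and the conversion efficiency are all hypothetical.

```python
import math


def power_splitting(signal_power, interference_power, rho,
                    antenna_noise=1e-3, processing_noise=1e-3, efficiency=0.5):
    """Return (rate in bit/s/Hz, harvested power) for a split ratio rho in [0, 1].

    Assumed model: SINR = rho*S / (rho*(I + sigma_a^2) + sigma_p^2),
    harvested power = efficiency * (1 - rho) * (S + I + sigma_a^2).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    # The information branch sees a scaled copy of signal, interference, and antenna noise,
    # plus processing noise that is added after the split.
    sinr = rho * signal_power / (
        rho * (interference_power + antenna_noise) + processing_noise
    )
    rate = math.log2(1.0 + sinr)
    # The energy branch harvests from everything that arrives at the antenna.
    received = signal_power + interference_power + antenna_noise
    harvested = efficiency * (1.0 - rho) * received
    return rate, harvested


if __name__ == "__main__":
    for rho in (0.2, 0.5, 0.8):
        rate, energy = power_splitting(1.0, 0.1, rho)
        print(f"rho={rho:.1f} rate={rate:.3f} harvested={energy:.3f}")
```

Raising `rho` increases the rate and lowers the harvested power, and interference hurts decoding while helping harvesting, which is why the split ratio and the beamformers have to be optimized jointly.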
Innovation

Methods, ideas, or system contributions that make the work stand out.

World models learn predictive environment representations for training
Uncertainty-aware offloading gate triggers selective edge coordination
Lightweight latent decorrelation mechanism refines agents' strategic representations
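The latent decorrelation idea in the last item can be sketched as follows. The summary does not give the actual objective, so this example assumes a simple stand-in: the penalty is the mean squared cosine similarity between agents' latent vectors, and the refinement step subtracts a fraction of each latent's projection onto the others. The names `decorrelation_penalty` and `refine`, and the `step` parameter, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def decorrelation_penalty(latents):
    """Mean squared cosine similarity over all agent pairs (0 means orthogonal)."""
    n = len(latents)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(cosine(latents[i], latents[j]) ** 2 for i, j in pairs) / len(pairs)


def refine(latents, step=0.5):
    """Push each latent away from the others by removing part of its projections."""
    refined = []
    for i, z in enumerate(latents):
        new_z = list(z)
        for j, other in enumerate(latents):
            if i == j:
                continue
            norm_sq = sum(b * b for b in other)
            if norm_sq == 0:
                continue
            # Projection coefficient is computed from the original (unrefined) latents.
            coeff = step * sum(a * b for a, b in zip(z, other)) / norm_sq
            new_z = [a - coeff * b for a, b in zip(new_z, other)]
        refined.append(new_z)
    return refined


if __name__ == "__main__":
    latents = [[1.0, 0.0], [1.0, 1.0]]
    print(decorrelation_penalty(latents))
    print(decorrelation_penalty(refine(latents)))
```

In a learned system the penalty would more likely be a loss term minimized by gradient descent at the edge; the explicit projection step here only illustrates the direction in which such a term pushes the representations.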
Authors
Guangyuan Liu
College of Computing and Data Science, the Energy Research Institute @ NTU, Interdisciplinary Graduate Program, Nanyang Technological University, Singapore
Yinqiu Liu
College of Computing and Data Science, Nanyang Technological University, Singapore
Ruichen Zhang
Nanyang Technological University
Next-generation Networking, Edge Intelligence, Agentic AI, Reinforcement learning, LLM
Dusit Niyato
College of Computing and Data Science, Nanyang Technological University, Singapore
Jiawen Kang
School of Automation, Guangdong University of Technology, China
Sumei Sun
Institute for Infocomm Research, A*STAR
5G/6G, integrated sensing-communications-computing-control, applied AI, secure & resilient comms
Abbas Jamalipour
School of Electrical and Computer Engineering, University of Sydney, Australia, and with the Graduate School of Information Sciences, Tohoku University, Japan
Ping Zhang
State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, China