DWM-RO: Decentralized World Models with Reasoning Offloading for SWIPT-enabled Satellite-Terrestrial HetNets

📅 2025-11-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: In SWIPT-enabled satellite-terrestrial heterogeneous networks, time-varying channels and multi-tier interference make distributed beamforming and power-splitting optimization difficult. Method: This paper proposes the Decentralized World Model with Reasoning Offloading (DWM-RO) framework, which combines per-agent world models with reasoning offloading. Each agent predicts environment dynamics and trains its policy in imagination; an uncertainty-driven offloading gate triggers coordination only when needed, and a lightweight latent decorrelation module at the edge steers agents toward orthogonal policies with low coordination overhead. Results: In simulation, DWM-RO converges 5× faster than state-of-the-art MARL baselines, improves spectral efficiency by 34.7%, and reduces the constraint violation rate by 40%. In dense 10-user scenarios, its violation rate stays below 20% while baselines exceed 70%.

📝 Abstract

Wireless networks are undergoing a paradigm shift toward massive connectivity with energy-efficient operation, driving the integration of satellite-terrestrial architectures with simultaneous wireless information and power transfer (SWIPT). Optimizing transmit beamforming and power splitting in such systems faces formidable challenges, e.g., time-varying channels and multi-tier interference. These create a complex decision landscape in which conventional model-free multi-agent reinforcement learning (MARL) suffers from sample inefficiency, because some state transitions are rarely encountered, and from poor coordination, because decentralized agents act independently. This paper proposes the Decentralized World Model with Reasoning Offloading (DWM-RO) framework to address these limitations. Specifically, each agent employs a world model to learn compact predictive representations of environment dynamics, enabling imagination-based policy training that reduces the number of required environment interactions. An uncertainty-aware offloading gate monitors local interference levels and model reconstruction errors to trigger selective edge coordination. When activated, a lightweight latent decorrelation mechanism at the edge refines agents' strategic representations, guiding them toward orthogonal actions that minimize resource conflicts. Extensive simulations demonstrate that DWM-RO converges 5 times faster than state-of-the-art baselines while achieving 34.7% higher spectral efficiency and reducing constraint violations by 40%. In dense network scenarios with 10 users, DWM-RO maintains violation rates below 20% while baselines exceed 70%, validating superior robustness.
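The offloading gate described in the abstract can be illustrated with a minimal sketch. The abstract says only that the gate monitors local interference and world-model reconstruction error; the threshold values, the OR-combination of the two signals, and the names `GateConfig`, `should_offload`, and `reconstruction_error` are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class GateConfig:
    # Hypothetical thresholds; the source gives no numeric values.
    interference_threshold: float = 0.5
    recon_error_threshold: float = 0.1


def reconstruction_error(observation, reconstruction) -> float:
    """Mean squared error between an observation and its reconstruction."""
    if len(observation) != len(reconstruction):
        raise ValueError("observation and reconstruction must have equal length")
    squared = ((o - r) ** 2 for o, r in zip(observation, reconstruction))
    return sum(squared) / len(observation)


def should_offload(interference: float, recon_error: float,
                   cfg: GateConfig = GateConfig()) -> bool:
    """Request edge coordination when either local signal is high.

    Assumption: the two signals are combined with a logical OR.
    """
    return (interference > cfg.interference_threshold
            or recon_error > cfg.recon_error_threshold)
```

With these assumed thresholds, an agent whose world model reconstructs its observation poorly would request coordination even under low interference, which matches the idea of offloading only when the local model is uncertain.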

Problem

Research questions and friction points this paper is trying to address.

Optimizing transmit beamforming and power splitting in satellite-terrestrial networks

Addressing sample inefficiency in multi-agent reinforcement learning for wireless systems

Reducing resource conflicts and interference in decentralized network coordination
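To make the power-splitting trade-off concrete, here is a small sketch of a textbook SWIPT receiver: a fraction `rho` of the received power goes to energy harvesting and the remainder to information decoding. The linear harvesting model, the single lumped noise term, the absence of interference, and the function name `power_splitting` are simplifying assumptions; the paper's system model may differ.

```python
import math


def power_splitting(rx_power_w: float, rho: float, noise_w: float,
                    eta: float = 0.5):
    """Return (rate in bit/s/Hz, harvested power in W) for split ratio rho.

    Assumptions: linear energy harvesting with efficiency eta, one lumped
    noise term, and no interference.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    harvested = eta * rho * rx_power_w
    rate = math.log2(1.0 + (1.0 - rho) * rx_power_w / noise_w)
    return rate, harvested
```

Raising `rho` increases harvested power and lowers the achievable rate, which is why the split ratio has to be optimized jointly with beamforming.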

Innovation

Methods, ideas, or system contributions that make the work stand out.

World models learn predictive environment representations for training

Uncertainty-aware offloading gate triggers selective edge coordination

Lightweight latent decorrelation mechanism refines agents' strategic representations
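The idea of steering agents toward orthogonal representations can be sketched as follows. The source does not specify how decorrelation is performed, so this example substitutes two standard tools: a penalty equal to the mean squared pairwise cosine similarity between agents' latent vectors, and modified Gram-Schmidt orthogonalization as a stand-in refinement step. All function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


def decorrelation_penalty(latents):
    """Mean squared cosine similarity over all pairs of agent latents."""
    n = len(latents)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(cosine(latents[i], latents[j]) ** 2 for i, j in pairs) / len(pairs)


def gram_schmidt(latents):
    """Modified Gram-Schmidt: remove from each latent its projection
    onto the previously processed latents (a stand-in, not the paper's method)."""
    basis = []
    for v in latents:
        w = list(v)
        for b in basis:
            bb = dot(b, b)
            if bb > 0.0:
                coef = dot(w, b) / bb
                w = [wi - coef * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis
```

A penalty near zero indicates that agents' latents point in nearly orthogonal directions, the condition the abstract associates with fewer resource conflicts.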

Bosch Group

Renningen, BW, DE
