Dynamical Multimodal Fusion with Mixture-of-Experts for Localizations

📅 2025-07-01
🤖 AI Summary
This paper targets two bottlenecks in multimodal fingerprinting localization for 6G integrated sensing and communication (ISAC): the contribution of each modality drifts with carrier frequency, and spatial and fingerprint ambiguities degrade accuracy under dynamic spectrum and non-line-of-sight (NLOS) conditions. It proposes a spatial-context-aware dynamic fusion network built on what the authors describe as the first large-scale multimodal mixture-of-experts (MoE) system for this task, with a modality-task dual-level architecture and a learnable routing mechanism. The framework combines trajectory-clustering-based representation learning, multi-task coordinate regression, and maximum mean discrepancy (MMD) regularization to jointly improve frequency adaptability and expert diversity. Evaluated on three real-world urban environments and three carrier frequencies (2.6, 6, and 28 GHz), the method achieves consistently sub-meter mean squared error, and in unseen NLOS scenarios it reduces localization error by 50% relative to state-of-the-art methods.
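To make the maximum mean discrepancy regularizer concrete, here is a minimal NumPy sketch of a squared-MMD estimate between the outputs of two experts, and of a penalty that rewards experts for differing. The RBF kernel, the `bandwidth` value, the biased estimator, and the sign convention in `diversity_penalty` are all illustrative assumptions; the summary does not say how the paper defines or weights this term.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Pairwise squared Euclidean distances between rows of x and rows of y.
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=1.0):
    # Biased estimate of the squared MMD between sample sets x and y.
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())

def diversity_penalty(expert_outputs, bandwidth=1.0):
    # Assumed form: negative mean pairwise MMD, so that minimizing the
    # penalty pushes the experts' output distributions apart.
    total, pairs = 0.0, 0
    for i in range(len(expert_outputs)):
        for j in range(i + 1, len(expert_outputs)):
            total += mmd_squared(expert_outputs[i], expert_outputs[j], bandwidth)
            pairs += 1
    return -total / pairs
```

Identical experts give a squared MMD of zero, so the penalty is largest (least negative) exactly when the experts have collapsed onto the same behavior.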

📝 Abstract
Multimodal fingerprinting is a key technique for sub-meter 6G integrated sensing and communications (ISAC) localization, but two hurdles block deployment: (i) the contribution each modality makes to the position estimate varies with operating conditions such as carrier frequency, and (ii) spatial and fingerprint ambiguities markedly undermine localization accuracy, especially in non-line-of-sight (NLOS) scenarios. To solve these problems, we introduce SCADF-MoE, a spatial-context-aware dynamic fusion network built on a soft mixture-of-experts backbone. SCADF-MoE first clusters neighboring points into short trajectories to inject explicit spatial context. It then adaptively fuses channel state information, angle-of-arrival profile, distance, and gain through its learnable MoE router, so that the most reliable cues dominate at each carrier band. The fused representation is fed to a modality-task MoE that simultaneously regresses the coordinates of every vertex in the trajectory and of its centroid, thereby exploiting inter-point correlations. Finally, an auxiliary maximum-mean-discrepancy loss enforces expert diversity and mitigates gradient interference, stabilizing multi-task training. On three real urban layouts and three carrier bands (2.6, 6, and 28 GHz), the model delivers consistent sub-meter MSE and halves unseen-NLOS error versus the best prior work. To our knowledge, this is the first work that leverages large-scale multimodal MoE for frequency-robust ISAC localization.
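The router-based fusion step described in the abstract can be illustrated with a small NumPy sketch. It is not the SCADF-MoE implementation: the single linear router, the shared embedding dimension, and the names `MODALITIES`, `route_and_fuse`, `router_weights`, and `router_bias` are assumptions made for illustration. The only part taken from the abstract is the idea that a learnable router produces soft weights over the four modalities and the fused representation is their weighted combination.

```python
import numpy as np

# The four modalities named in the abstract; the keys are hypothetical labels.
MODALITIES = ("csi", "aoa", "distance", "gain")

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def route_and_fuse(embeddings, router_weights, router_bias):
    """Fuse per-modality embeddings with soft routing weights.

    embeddings: dict mapping modality name -> (batch, dim) array.
    router_weights: (4 * dim, 4) array; router_bias: (4,) array.
    Returns (fused, gates) with shapes (batch, dim) and (batch, 4).
    """
    stacked = np.stack([embeddings[m] for m in MODALITIES], axis=1)  # (batch, 4, dim)
    router_input = stacked.reshape(stacked.shape[0], -1)             # (batch, 4 * dim)
    gates = softmax(router_input @ router_weights + router_bias)     # (batch, 4)
    # Weighted sum of modality embeddings, one gate per modality per sample.
    fused = np.einsum("bm,bmd->bd", gates, stacked)
    return fused, gates
```

With a router whose weights and bias are all zero, every gate equals 0.25 and the fused vector is the plain average of the four embeddings; training the router is what would let one modality dominate at a given carrier band.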
Problem

Research questions and friction points this paper is trying to address.

Dynamic fusion of varying multimodal contributions for 6G localization
Reducing spatial and fingerprint ambiguities in NLOS scenarios
Achieving frequency-robust sub-meter accuracy in ISAC systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatial-context aware dynamic fusion network
Learnable MoE router for adaptive fusion
Modality-task MoE exploiting inter-point correlations
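As a rough illustration of how regressing both the trajectory vertices and their centroid couples neighboring points, the sketch below computes a combined squared-error objective in NumPy. The 2-D coordinates, the equal default weighting via `centroid_weight`, and the function name `trajectory_multitask_loss` are assumptions; the listing does not specify the paper's actual loss.

```python
import numpy as np

def trajectory_multitask_loss(pred_vertices, true_vertices, pred_centroid,
                              centroid_weight=1.0):
    """Squared-error loss over trajectory vertices plus their centroid.

    pred_vertices, true_vertices: (batch, n_points, 2) arrays.
    pred_centroid: (batch, 2) array from a separate centroid head (assumed).
    """
    # The ground-truth centroid is the mean of the true vertices.
    true_centroid = true_vertices.mean(axis=1)
    vertex_loss = np.mean(np.sum((pred_vertices - true_vertices) ** 2, axis=-1))
    centroid_loss = np.mean(np.sum((pred_centroid - true_centroid) ** 2, axis=-1))
    return vertex_loss + centroid_weight * centroid_loss
```

Because the centroid target depends on every vertex in the trajectory, an error on one point also shows up in the shared centroid term, which is one simple way inter-point correlation can enter training.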
Authors

Bohao Wang
College of Information Science & Electronic Engineering, Zhejiang University
Wireless AI, Communication, 6G, Digital Twin, Ray Tracing
Zitao Shuai
UCLA; University of Michigan
Fenghao Zhu
Zhejiang University
Beamforming, Energy Efficiency, Optimization
Chongwen Huang
College of Information Science and Electronic Engineering, Zhejiang University, 310027, Hangzhou, China; State Key Laboratory of Integrated Service Networks, Xidian University, 710071, Xi’an, China
Yongliang Shen
College of Computer Science and Technology, Zhejiang University, 310027, Hangzhou, China
Zhaoyang Zhang
College of Information Science and Electronic Engineering, Zhejiang University, 310027, Hangzhou, China
Qianqian Yang
Zhejiang University
Information Theory, Wireless AI, Semantic Communication, Machine Learning
Sami Muhaidat
Professor, Khalifa University; Adjunct Professor, Carleton University
Wireless Communications, Machine Learning, Optical Wireless Communication, V2V
Merouane Debbah
KU 6G Center, Khalifa University; CentraleSupélec
6G, Large Language Models, AI, Random Matrix Theory, Game Theory