CoBEVMoE: Heterogeneity-aware Feature Fusion with Dynamic Mixture-of-Experts for Collaborative Perception

📅 2025-09-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
In multi-agent collaborative perception, heterogeneous observations arise from differing viewpoints and spatial positions, while existing intermediate fusion methods overemphasize feature alignment at the expense of perceptual diversity. To address this, we propose a dynamic Mixture-of-Experts (MoE) fusion framework operating in Bird’s Eye View (BEV) space. Our approach introduces learnable dynamic expert networks that explicitly model inter-agent feature similarity and dissimilarity, and incorporates a Dynamic Expert Metric Loss (DEML) to jointly optimize expert diversity and discriminative capability. By integrating BEV feature alignment, a dynamic MoE architecture, and DEML, our method achieves differentiated fusion of heterogeneous features. Experiments demonstrate state-of-the-art performance: +1.5% IoU for camera-based BEV segmentation on OPV2V and +3.0% AP@50 for LiDAR-based 3D object detection on DAIR-V2X-C.

📝 Abstract
Collaborative perception aims to extend sensing coverage and improve perception accuracy by sharing information among multiple agents. However, due to differences in viewpoints and spatial positions, agents often acquire heterogeneous observations. Existing intermediate fusion methods primarily focus on aligning similar features, often overlooking the perceptual diversity among agents. To address this limitation, we propose CoBEVMoE, a novel collaborative perception framework that operates in the Bird's Eye View (BEV) space and incorporates a Dynamic Mixture-of-Experts (DMoE) architecture. In DMoE, each expert is dynamically generated based on the input features of a specific agent, enabling it to extract distinctive and reliable cues while attending to shared semantics. This design allows the fusion process to explicitly model both feature similarity and heterogeneity across agents. Furthermore, we introduce a Dynamic Expert Metric Loss (DEML) to enhance inter-expert diversity and improve the discriminability of the fused representation. Extensive experiments on the OPV2V and DAIR-V2X-C datasets demonstrate that CoBEVMoE achieves state-of-the-art performance. Specifically, it improves the IoU for Camera-based BEV segmentation by +1.5% on OPV2V and the AP@50 for LiDAR-based 3D object detection by +3.0% on DAIR-V2X-C, verifying the effectiveness of expert-based heterogeneous feature modeling in multi-agent collaborative perception. The source code will be made publicly available at https://github.com/godk0509/CoBEVMoE.
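The abstract describes experts that are dynamically generated from each agent's input features and then gated into a fused BEV representation. The paper's exact architecture is not given in this summary, so the following is a minimal numpy sketch under assumed shapes: each agent's pooled BEV descriptor generates a hypothetical channel-mixing expert (the generator `w_gen` and the energy-based gating rule are illustrative stand-ins, not the authors' design).

```python
import numpy as np

rng = np.random.default_rng(0)

A, C, H, W = 3, 4, 8, 8                 # agents, channels, BEV grid size
bev_feats = rng.normal(size=(A, C, H, W))

# Hypothetical expert generator: maps an agent's pooled descriptor (C,)
# to that agent's expert, a (C, C) channel-mixing matrix.
w_gen = rng.normal(size=(C, C * C)) * 0.1

def dynamic_moe_fuse(bev_feats, w_gen):
    """Sketch of input-conditioned expert fusion in BEV space."""
    A, C, H, W = bev_feats.shape
    desc = bev_feats.mean(axis=(2, 3))                    # (A, C) global descriptors
    experts = (desc @ w_gen).reshape(A, C, C)             # one expert per agent
    expert_out = np.einsum('aij,ajhw->aihw', experts, bev_feats)
    # Per-location gating over agents: softmax of each agent's feature energy
    logits = expert_out.sum(axis=1)                       # (A, H, W)
    gates = np.exp(logits - logits.max(axis=0))
    gates /= gates.sum(axis=0)                            # softmax over the agent axis
    fused = (gates[:, None] * expert_out).sum(axis=0)     # (C, H, W) fused BEV map
    return fused, gates

fused, gates = dynamic_moe_fuse(bev_feats, w_gen)
```

The key property the sketch preserves is that expert parameters are a function of the agent's own features, so each expert can specialize to that agent's viewpoint while the gate arbitrates per BEV cell.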
Problem

Research questions and friction points this paper is trying to address.

Addressing heterogeneous observations from different agent viewpoints
Modeling both feature similarity and heterogeneity across agents
Improving collaborative perception accuracy through dynamic expert fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Mixture-of-Experts architecture for heterogeneous feature fusion
Bird's Eye View space operation with agent-specific expert generation
Dynamic Expert Metric Loss to enhance inter-expert diversity
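The summary states that DEML enhances inter-expert diversity but does not give its formula. A common metric-style formulation for such an objective penalizes pairwise cosine similarity between expert parameters above a margin; the sketch below (hypothetical `margin` value and flattened-weight input) illustrates that idea, not the paper's exact loss.

```python
import numpy as np

def expert_diversity_loss(expert_weights, margin=0.5):
    """Penalize pairwise cosine similarity between flattened expert
    parameter vectors above a margin, pushing experts apart."""
    normed = expert_weights / np.linalg.norm(expert_weights, axis=1, keepdims=True)
    sim = normed @ normed.T                        # pairwise cosine similarities
    n = sim.shape[0]
    off_diag = sim[~np.eye(n, dtype=bool)]         # ignore self-similarity
    return float(np.maximum(off_diag - margin, 0.0).mean())

# Two identical experts (similarity 1.0) are penalized;
# orthogonal experts (similarity 0.0) incur no loss.
same = np.array([[1.0, 0.0], [1.0, 0.0]])
ortho = np.array([[1.0, 0.0], [0.0, 1.0]])
```

Minimizing such a term alongside the task loss trades off expert specialization against redundancy, which matches the stated goal of jointly optimizing diversity and discriminability.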