🤖 AI Summary
This work addresses the challenge of feature domain inconsistency in collaborative perception among heterogeneous autonomous vehicles, which arises from differences in sensors and perception models. Existing approaches suffer from high computational costs and potential privacy risks. To overcome these limitations, the authors propose a lightweight, privacy-preserving collaborative perception framework that aligns heterogeneous features into a unified space by fine-tuning only 0.06% of parameters via low-rank visual prompts, followed by pyramid-based feature fusion for efficient and robust aggregation. The method enables seamless adaptation to new vehicle types without extensive retraining. Evaluated on the OPV2V-H dataset, it achieves a 2% improvement in detection performance over state-of-the-art methods while reducing trainable parameters by 94%, effectively balancing efficiency, accuracy, and scalability.
📝 Abstract
Collaborative perception (CP) is a promising paradigm for improving situational awareness in autonomous vehicles by overcoming the limitations of single-agent perception. However, most existing approaches assume homogeneous agents, which restricts their applicability in real-world scenarios where vehicles use diverse sensors and perception models. This heterogeneity introduces a feature domain gap that degrades detection performance. Prior works address this issue either by retraining entire models or their major components, or by training a feature interpreter for each new agent type. These strategies are computationally expensive, compromise privacy, and may reduce single-agent accuracy. We propose Faster-HEAL, a lightweight and privacy-preserving CP framework that fine-tunes a low-rank visual prompt to align heterogeneous features with a unified feature space while leveraging pyramid fusion for robust feature aggregation. This approach reduces the trainable parameters by 94%, enabling efficient adaptation to new agents without retraining large models. Experiments on the OPV2V-H dataset show that Faster-HEAL improves detection performance by 2% over state-of-the-art methods with significantly lower computational overhead, offering a practical solution for scalable heterogeneous CP.
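The abstract does not spell out how the low-rank visual prompt is parameterized, so the following is only a minimal sketch of the general idea, not Faster-HEAL's implementation. It assumes the prompt is an additive offset on a frozen encoder's (C, H, W) feature map, factored per channel into two thin matrices. The function names, tensor sizes, and rank are all hypothetical and chosen only for illustration.

```python
import numpy as np


def make_low_rank_prompt(channels, height, width, rank, rng):
    """Create factors A (C, H, r) and B (C, r, W) of an additive prompt.

    Assumption: B starts at zero so the prompt is initially a no-op and
    the frozen encoder's features pass through unchanged.
    """
    A = rng.normal(scale=0.01, size=(channels, height, rank))
    B = np.zeros((channels, rank, width))
    return A, B


def apply_prompt(features, A, B):
    """Add the low-rank prompt to a (C, H, W) feature map."""
    prompt = A @ B  # batched matrix product, shape (C, H, W)
    return features + prompt


rng = np.random.default_rng(0)
C, H, W, r = 8, 32, 48, 2  # hypothetical sizes, not taken from the paper

features = rng.normal(size=(C, H, W))  # stand-in for frozen encoder output
A, B = make_low_rank_prompt(C, H, W, r, rng)
aligned = apply_prompt(features, A, B)

# Only A and B would be trained; the encoder stays frozen.
full_prompt_params = C * H * W
low_rank_params = A.size + B.size
print(f"full prompt: {full_prompt_params} parameters")
print(f"low-rank prompt: {low_rank_params} parameters")
```

In this sketch only `A` and `B` would receive gradient updates, which is why the trainable parameter count stays small relative to a dense prompt of the same shape. The savings shown here depend entirely on the illustrative sizes and should not be read as reproducing the figures reported above.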