INSTINCT: Instance-Level Interaction Architecture for Query-Based Collaborative Perception

📅 2025-09-28
🤖 AI Summary
Existing LiDAR-based multi-vehicle collaborative perception methods lose accuracy in long-range and occluded scenarios and are tightly constrained by communication bandwidth. To address these challenges, the paper proposes INSTINCT, a query-driven, instance-level collaborative perception framework. Its core contributions are: (1) a quality-aware filtering mechanism that selects high-quality instance features for transmission; (2) a dual-branch detection routing scheme that decouples collaboration-irrelevant from collaboration-relevant instances; and (3) a Cross Agent Local Instance Fusion module that aggregates local hybrid instance features. On the DAIR-V2X and V2V4Real benchmarks, the method reports accuracy improvements of 13.23% and 33.08%, respectively, while reducing communication bandwidth to 1/281 and 1/264 of that of state-of-the-art methods.

📝 Abstract
Collaborative perception systems overcome single-vehicle limitations in long-range detection and occlusion scenarios by integrating multi-agent sensory data, improving accuracy and safety. However, frequent cooperative interactions and real-time requirements impose stringent bandwidth constraints. Previous work shows that query-based instance-level interaction reduces bandwidth demands and reliance on manual priors; however, LiDAR-focused implementations in collaborative perception remain underdeveloped, with performance still trailing state-of-the-art approaches. To bridge this gap, we propose INSTINCT (INSTance-level INteraCtion ArchiTecture), a novel collaborative perception framework featuring three core components: 1) a quality-aware filtering mechanism for high-quality instance feature selection; 2) a dual-branch detection routing scheme to decouple collaboration-irrelevant and collaboration-relevant instances; and 3) a Cross Agent Local Instance Fusion module to aggregate local hybrid instance features. Additionally, we enhance the ground truth (GT) sampling technique to facilitate training with diverse hybrid instance features. Extensive experiments across multiple datasets demonstrate that INSTINCT achieves superior performance. Specifically, our method improves accuracy by 13.23% and 33.08% on DAIR-V2X and V2V4Real, respectively, while reducing the communication bandwidth to 1/281 and 1/264 of that required by state-of-the-art methods. The code is available at https://github.com/CrazyShout/INSTINCT.
Problem

Research questions and friction points this paper is trying to address.

Reducing bandwidth demands in collaborative perception systems
Improving LiDAR-based query interaction for occlusion scenarios
Enhancing detection accuracy while minimizing communication overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quality-aware filtering mechanism selects high-quality instance features
Dual-branch detection routing decouples collaboration-irrelevant and collaboration-relevant instances
Cross Agent Local Instance Fusion aggregates hybrid instance features
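
To make the idea of quality-aware filtering concrete, here is a minimal sketch of how an agent might choose which instance queries to transmit. It is an illustration under stated assumptions, not the INSTINCT implementation: the `InstanceQuery` container, the `select_queries` and `payload_bytes` helpers, the quality threshold, the per-agent cap, and the float32 size estimate are all hypothetical and do not come from the paper or its repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceQuery:
    """Hypothetical container for one instance-level query (not from the paper)."""
    feature: List[float]  # instance feature vector
    quality: float        # assumed quality score in [0, 1]


def select_queries(queries, min_quality=0.3, max_queries=2):
    """Keep queries that pass the quality threshold, best first, capped at max_queries."""
    kept = [q for q in queries if q.quality >= min_quality]
    kept.sort(key=lambda q: q.quality, reverse=True)
    return kept[:max_queries]


def payload_bytes(queries, bytes_per_value=4):
    """Rough transmitted size, assuming float32 features plus one score per query."""
    return sum((len(q.feature) + 1) * bytes_per_value for q in queries)


queries = [
    InstanceQuery([0.1, 0.2, 0.3, 0.4], quality=0.9),
    InstanceQuery([0.5, 0.1, 0.0, 0.2], quality=0.1),
    InstanceQuery([0.3, 0.3, 0.3, 0.3], quality=0.6),
    InstanceQuery([0.9, 0.8, 0.7, 0.6], quality=0.4),
]

selected = select_queries(queries)
print([q.quality for q in selected])                     # qualities of the transmitted queries
print(payload_bytes(queries), payload_bytes(selected))   # size before and after filtering
```

In this toy setup, only the two highest-quality queries are sent, so the payload shrinks in proportion to the number of queries dropped. The bandwidth figures reported in the abstract come from the authors' experiments, not from this sketch.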
Authors

Yunjiang Xu
School of Computer Science and Technology, Soochow University
Lingzhi Li
Tongji University
Jin Wang
School of Future Science and Engineering, Soochow University
Yupeng Ouyang
School of Computer Science and Technology, Soochow University
Benyuan Yang
School of Future Science and Engineering, Soochow University