Towards Efficient 3D Object Detection for Vehicle-Infrastructure Collaboration via Risk-Intent Selection

📅 2026-01-06

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the tension between limited communication bandwidth and redundant feature transmission in vehicle-infrastructure cooperative perception by proposing Risk-intent Selective detection (RiSe), a framework that integrates motion risk awareness and driving intention into feature selection. It employs a Potential Field-Trajectory Correlation Model (PTCM), grounded in potential field theory, to assess kinematic interaction risk, and couples it with an Intention-Driven Area Prediction Module (IDAPM) that predicts the key BEV areas, so that high-fidelity features are transmitted only from high-risk, highly interactive regions. The approach shifts feature transmission from “visible areas” to “risk-critical areas.” Evaluated on the DeepAccident dataset, RiSe reduces communication volume to 0.71% of full-feature sharing while maintaining state-of-the-art detection accuracy, which the authors describe as a competitive Pareto frontier between bandwidth efficiency and perception performance.

📝 Abstract

Vehicle-Infrastructure Collaborative Perception (VICP) is pivotal for resolving occlusion in autonomous driving, yet the trade-off between communication bandwidth and feature redundancy remains a critical bottleneck. While intermediate fusion reduces data volume compared to raw sharing, existing frameworks typically rely on spatial compression or static confidence maps, and so still transmit spatially redundant features from non-critical background regions. To address this, we propose Risk-intent Selective detection (RiSe), an interaction-aware framework that shifts the paradigm from identifying visible regions to prioritizing risk-critical ones. Specifically, we introduce a Potential Field-Trajectory Correlation Model (PTCM), grounded in potential field theory, to quantitatively assess kinematic risks. Complementing this, an Intention-Driven Area Prediction Module (IDAPM) leverages ego-motion priors to proactively predict and filter the key Bird's-Eye-View (BEV) areas essential for decision-making. By integrating these components, RiSe implements a semantic-selective fusion scheme that transmits high-fidelity features only from high-interaction regions, effectively acting as a feature denoiser. Extensive experiments on the DeepAccident dataset demonstrate that our method reduces communication volume to 0.71% of full feature sharing while maintaining state-of-the-art detection accuracy, establishing a competitive Pareto frontier between bandwidth efficiency and perception performance.
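To make the idea of risk-guided feature selection concrete, the following is a minimal sketch, not the authors' PTCM or IDAPM. It assumes a toy repulsive potential (one Gaussian bump per agent, scaled by speed) as a stand-in for the risk model, and a simple top-k rule as a stand-in for area selection. All function names, parameters, and values (`potential_risk_map`, `select_features`, `sigma`, `keep_ratio`, the agent list) are hypothetical and chosen only for illustration.

```python
import numpy as np


def potential_risk_map(agents, grid_hw, cell_size=1.0, sigma=3.0):
    """Toy risk field (an assumption, not the paper's PTCM).

    agents: iterable of (x, y, speed) in metres and m/s.
    Each agent adds a Gaussian bump whose height grows with its speed.
    """
    h, w = grid_hw
    ys, xs = np.mgrid[0:h, 0:w]
    # Cell centres in metres, with the grid origin at the top-left corner.
    cx = (xs + 0.5) * cell_size
    cy = (ys + 0.5) * cell_size
    risk = np.zeros((h, w), dtype=np.float64)
    for x, y, speed in agents:
        d2 = (cx - x) ** 2 + (cy - y) ** 2
        risk += (1.0 + speed) * np.exp(-d2 / (2.0 * sigma ** 2))
    return risk


def select_features(features, risk, keep_ratio):
    """Keep only the BEV cells with the highest risk; zero out the rest.

    features: array of shape (C, H, W); risk: array of shape (H, W).
    Returns the masked features and the boolean (H, W) selection mask.
    """
    _, h, w = features.shape
    k = max(1, int(round(keep_ratio * h * w)))
    # Indices of the k largest risk values (order among them is irrelevant).
    top = np.argpartition(risk.ravel(), -k)[-k:]
    mask = np.zeros(h * w, dtype=bool)
    mask[top] = True
    mask = mask.reshape(h, w)
    # The (H, W) mask broadcasts over the channel axis.
    return features * mask, mask


def communication_ratio(mask):
    """Fraction of BEV cells that would be transmitted."""
    return mask.sum() / mask.size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 100, 100))        # hypothetical BEV features
    agents = [(20.0, 30.0, 8.0), (70.0, 65.0, 2.0)]  # hypothetical agents
    risk = potential_risk_map(agents, (100, 100))
    sparse, mask = select_features(features, risk, keep_ratio=0.0071)
    print(f"transmitted fraction: {communication_ratio(mask):.4f}")
```

Here `keep_ratio=0.0071` is set only to mirror the 0.71% figure reported above; in the paper that number is a measured result of the full method, not an input parameter.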

Problem

Research questions and friction points this paper is trying to address.

Vehicle-Infrastructure Collaboration

3D Object Detection

Communication Efficiency

Feature Redundancy

Occlusion

Innovation

Methods, ideas, or system contributions that make the work stand out.

Risk-intent Selection

Vehicle-Infrastructure Collaboration

Semantic-selective Fusion