🤖 AI Summary
Collaborative perception (CP) is widely regarded as a key remedy for the inherent limitations of single-vehicle perception, particularly under occlusion and at long range, yet the field lacks a systematic, rigorous survey. This study reviews 106 peer-reviewed papers following the PRISMA 2020 guidelines. It organizes the literature along three axes (sensing modality, collaboration scheme, and perception task) to characterize advances and bottlenecks in CP research. It identifies a misalignment between prevailing evaluation metrics and CP's fundamental objectives. It also compares how existing methods handle six practical challenges: pose errors, temporal latency, communication constraints, domain shift, heterogeneity, and adversarial attacks. The review closes with insights into challenges, opportunities, and risks to guide future research in vehicular CP.
📝 Abstract
The effectiveness of autonomous vehicles depends on reliable perception. Despite significant advances in artificial intelligence and sensor fusion, single-vehicle perception systems remain limited, notably by visual occlusion and restricted long-range detection. Collaborative Perception (CP), enabled by Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication, has emerged as a promising way to mitigate these issues and improve the reliability of autonomous systems. Beyond advances in communication, the computer vision community is increasingly focusing on improving vehicular perception through collaborative approaches. However, a systematic literature review that thoroughly examines existing work and reduces subjective bias is still lacking. Such a review helps identify research gaps, recognize common trends across studies, and inform future research directions. In response, this study follows the PRISMA 2020 guidelines and includes 106 peer-reviewed articles. These publications are analyzed by modality, collaboration scheme, and key perception task. Through comparative analysis, the review shows how different methods address practical issues such as pose errors, temporal latency, communication constraints, domain shifts, heterogeneity, and adversarial attacks. It also critically examines evaluation methodologies, highlighting a misalignment between current metrics and CP's fundamental objectives. By covering these topics in depth, the review offers insights into challenges, opportunities, and risks, and serves as a reference for advancing research in vehicular collaborative perception.