Progressive Bird's Eye View Perception for Safety-Critical Autonomous Driving: A Comprehensive Survey

📅 2025-08-10
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
BEV perception faces reliability bottlenecks in safety-critical scenarios such as occlusion, adverse weather, and dynamic traffic, which hinder its deployment in autonomous driving. Method: This paper presents the first systematic survey of BEV perception evolution from a functional-safety perspective, categorizing advances into three phases: unimodal, multimodal onboard, and multi-agent collaborative perception. It proposes a unified taxonomy of open-world challenges, identifying core issues including sensor degradation, unknown-class recognition, and low-latency coordination. The survey integrates multi-sensor fusion, open-set recognition, label-free learning, degradation-resilient modeling, and vehicle-road-cloud low-latency communication, and reviews mainstream architectures and benchmark datasets. Contribution/Results: It clarifies critical technical bottlenecks and provides theoretical foundations and practical pathways for advancing BEV perception toward end-to-end autonomous driving, embodied intelligence, and large-model-augmented systems.
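To make the BEV representation concrete, the sketch below fills a ground-plane grid from a single camera image via inverse perspective mapping, the simplest camera-to-BEV transform. It is a minimal illustration under strong assumptions (flat ground, known calibration), not any specific method from the paper; the function name, the intrinsics `K`, the extrinsic `T_cam_from_ego`, and the grid parameters are placeholders.

```python
import numpy as np

def camera_to_bev(image, K, T_cam_from_ego, x_range=(0.0, 40.0),
                  y_range=(-20.0, 20.0), cell=0.5):
    """Inverse perspective mapping: fill a ground-plane (z = 0) BEV grid
    by projecting each cell into the image and sampling the pixel there.

    image: (H, W, C) array; K: (3, 3) camera intrinsics;
    T_cam_from_ego: (4, 4) rigid transform from ego to camera coordinates.
    All names are illustrative placeholders, not the paper's API.
    """
    xs = np.arange(*x_range, cell)             # forward axis of the ego frame
    ys = np.arange(*y_range, cell)             # lateral axis of the ego frame
    gy, gx = np.meshgrid(ys, xs, indexing="ij")
    ground = np.stack([gx, gy, np.zeros_like(gx), np.ones_like(gx)], axis=-1)
    cam = ground @ T_cam_from_ego.T            # homogeneous ego -> camera
    pix = cam[..., :3] @ K.T                   # pinhole projection
    z = np.maximum(pix[..., 2], 1e-3)          # clamp to keep division safe
    u = (pix[..., 0] / z).astype(int)
    v = (pix[..., 1] / z).astype(int)
    h, w = image.shape[:2]
    inside = (pix[..., 2] > 1e-3) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    bev = np.zeros(gx.shape + (image.shape[2],), dtype=image.dtype)
    bev[inside] = image[v[inside], u[inside]]  # cells behind the camera stay empty
    return bev
```

Learned camera-only BEV methods such as lift-splat-style approaches replace the flat-ground assumption with per-pixel depth distributions, which is one reason the unimodal phase discussed above remains brittle under occlusion.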

📝 Abstract
Bird's-Eye-View (BEV) perception has become a foundational paradigm in autonomous driving, enabling unified spatial representations that support robust multi-sensor fusion and multi-agent collaboration. As autonomous vehicles transition from controlled environments to real-world deployment, ensuring the safety and reliability of BEV perception in complex scenarios (such as occlusions, adverse weather, and dynamic traffic) remains a critical challenge. This survey provides the first comprehensive review of BEV perception from a safety-critical perspective, systematically analyzing state-of-the-art frameworks and implementation strategies across three progressive stages: single-modality vehicle-side, multimodal vehicle-side, and multi-agent collaborative perception. Furthermore, we examine public datasets encompassing vehicle-side, roadside, and collaborative settings, evaluating their relevance to safety and robustness. We also identify key open-world challenges, including open-set recognition, large-scale unlabeled data, sensor degradation, and inter-agent communication latency, and outline future research directions, such as integration with end-to-end autonomous driving systems, embodied intelligence, and large language models.
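As a rough illustration of the multimodal vehicle-side stage, the snippet below fuses camera and LiDAR feature maps that are assumed to be already rasterized onto a shared BEV grid. Channel concatenation followed by a convolution is one common fusion recipe, not the survey's prescribed design; the class name, channel counts, and grid size are illustrative.

```python
import torch
import torch.nn as nn

class BevFusion(nn.Module):
    """Fuse camera and LiDAR BEV features by channel concatenation.

    Both inputs are assumed to live on the same (H, W) BEV grid; the
    class name and channel sizes are illustrative, not from the paper.
    """
    def __init__(self, cam_ch: int = 64, lidar_ch: int = 64, out_ch: int = 128):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(cam_ch + lidar_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, cam_bev: torch.Tensor, lidar_bev: torch.Tensor) -> torch.Tensor:
        # (B, cam_ch, H, W) + (B, lidar_ch, H, W) -> (B, out_ch, H, W)
        return self.fuse(torch.cat([cam_bev, lidar_bev], dim=1))

# Toy usage on a 200 x 200 grid (0.5 m cells covering a 100 m x 100 m area).
fused = BevFusion()(torch.rand(1, 64, 200, 200), torch.rand(1, 64, 200, 200))
```

Because both modalities share one spatial grid, a degraded sensor (e.g., a rain-blurred camera) can in principle be down-weighted without redesigning the detection head, which is the robustness argument for BEV fusion made in the abstract.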
Problem

Research questions and friction points this paper is trying to address.

Ensuring BEV perception safety in complex driving scenarios
Reviewing BEV frameworks for multi-sensor and multi-agent systems
Addressing open-world challenges like sensor degradation and latency
Innovation

Methods, ideas, or system contributions that make the work stand out.

BEV perception for multi-sensor fusion
Multimodal vehicle-side perception frameworks
Multi-agent collaborative perception strategies (a minimal fusion sketch follows this list)
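As referenced in the list above, here is a minimal sketch of collaborative late fusion: a cooperating agent's BEV grid is warped into the ego frame using its relative planar pose, then merged cell-wise. The pose (yaw, tx, ty), the shared grid layout, and element-wise max fusion are simplifying assumptions for illustration; real systems must also handle pose error and the communication latency flagged in the Problem section.

```python
import numpy as np

def fuse_agents(ego_bev, other_bev, yaw, tx, ty, cell=0.5):
    """Warp a cooperating agent's ego-centred BEV grid into our frame,
    then merge by element-wise max (a simple late-fusion rule).

    (yaw, tx, ty): the other agent's planar pose relative to us, assumed
    known (e.g., from GNSS); both grids share the same cell size.
    """
    h, w = ego_bev.shape[:2]
    vs, us = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    ex = (us - w / 2) * cell                   # metric x of each of our cells
    ey = (vs - h / 2) * cell                   # metric y of each of our cells
    # Inverse rigid transform: our frame -> the other agent's frame.
    c, s = np.cos(yaw), np.sin(yaw)
    ox = c * (ex - tx) + s * (ey - ty)
    oy = -s * (ex - tx) + c * (ey - ty)
    # Nearest-neighbour lookup into the other agent's grid.
    u = np.round(ox / cell + w / 2).astype(int)
    v = np.round(oy / cell + h / 2).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    warped = np.zeros_like(ego_bev)
    warped[inside] = other_bev[v[inside], u[inside]]
    return np.maximum(ego_bev, warped)
```

Max fusion keeps whichever agent sees a cell more confidently, so an occluded region in the ego view can be recovered from a roadside or neighbouring-vehicle view; intermediate (feature-level) fusion trades more bandwidth for better accuracy.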
Authors
Yan Gong
State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, 150001, China.
Naibang Wang
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Jianli Lu
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Xinyu Zhang
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Yongsheng Gao
State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, 150001, China.
Jie Zhao
State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, 150001, China.
Zifan Huang
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Haozhi Bai
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Nanxin Zeng
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Nayu Su
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Lei Yang
School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore.
Ziying Song
Beijing Jiaotong University.
Xiaoxi Hu
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Xinmin Jiang
State Key Laboratory of Intelligent Green Vehicle and Mobility, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
Xiaojuan Zhang
Institute for Infocomm Research, A*STAR, Singapore.
Susanto Rahardja
Engineering Cluster, Singapore Institute of Technology, Singapore.