MC-BEVRO: Multi-Camera Bird Eye View Road Occupancy Detection for Traffic Monitoring

📅 2025-02-16
🤖 AI Summary
To address inaccurate road occupancy detection in traffic monitoring caused by occlusion and limited field of view in single-camera 3D perception, this paper proposes a BEV (Bird's-Eye-View) road occupancy detection framework that combines multiple roadside cameras. Methodologically, it implements one late-fusion and three early-fusion methods for multi-camera BEV occupancy estimation; constructs a synthetic dataset using CARLA to support zero-shot and few-shot sim-to-real transfer; and incorporates multi-view feature alignment, BEV-space projection, and background-enhanced fusion. Contributions include: (1) substantial accuracy improvement, with multi-camera input boosting mAP by over 18% versus single-camera baselines; (2) empirical validation of robustness to BEV resolution; and (3) practical-level performance under zero-shot transfer, with further significant gains after few-shot fine-tuning. The framework balances deployment feasibility and compatibility with existing traffic infrastructure.

📝 Abstract
Single-camera 3D perception for traffic monitoring faces significant challenges due to occlusion and limited field of view. Moreover, fusing information from multiple cameras at the image feature level is difficult because of differing view angles. Further, the necessity for practical implementation and compatibility with existing traffic infrastructure compounds these challenges. To address these issues, this paper introduces a novel Bird's-Eye-View road occupancy detection framework that leverages multiple roadside cameras to overcome the aforementioned limitations. To facilitate the framework's development and evaluation, a synthetic dataset featuring diverse scenes and varying camera configurations is generated using the CARLA simulator. One late fusion method and three early fusion methods were implemented within the proposed framework, with performance further enhanced by integrating backgrounds. Extensive evaluations were conducted to analyze the impact of multi-camera inputs and varying BEV occupancy map sizes on model performance. Additionally, a real-world data collection pipeline was developed to assess the model's ability to generalize to real-world environments. The sim-to-real capabilities of the model were evaluated using zero-shot and few-shot fine-tuning, demonstrating its potential for practical application. This research aims to advance perception systems in traffic monitoring, contributing to improved traffic management, operational efficiency, and road safety.
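To make the late-fusion idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes each camera has already produced its own BEV occupancy probability map on a shared ground-plane grid, and it merges them with an element-wise maximum followed by a threshold. The fusion rule, the names `late_fuse` and `to_occupancy`, and the 0.5 threshold are all illustrative assumptions; the abstract does not specify how the paper's late fusion combines the per-camera maps.

```python
from typing import List

Grid = List[List[float]]


def late_fuse(bev_maps: List[Grid]) -> Grid:
    """Fuse per-camera BEV occupancy probabilities with an element-wise max.

    Assumption: every map covers the same ground-plane grid, so cell (r, c)
    refers to the same road patch in each camera's output.
    """
    if not bev_maps:
        raise ValueError("need at least one BEV map")
    rows, cols = len(bev_maps[0]), len(bev_maps[0][0])
    for m in bev_maps:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all BEV maps must share one grid size")
    # A cell counts as likely occupied if any camera sees it as occupied,
    # which lets one view cover for another view's occlusion.
    return [
        [max(m[r][c] for m in bev_maps) for c in range(cols)]
        for r in range(rows)
    ]


def to_occupancy(fused: Grid, threshold: float = 0.5) -> List[List[int]]:
    """Binarize fused probabilities (threshold value is hypothetical)."""
    return [[1 if p >= threshold else 0 for p in row] for row in fused]


# Two hypothetical cameras: each one sees a vehicle the other misses.
cam_a = [[0.9, 0.1], [0.0, 0.2]]
cam_b = [[0.2, 0.1], [0.0, 0.8]]
fused = late_fuse([cam_a, cam_b])
occupancy = to_occupancy(fused)
```

In this toy case the fused grid marks both the top-left and bottom-right cells as occupied, even though neither camera alone detects both. Early fusion, by contrast, would merge the views at the feature level before a single occupancy prediction is made.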
Problem

Research questions and friction points this paper is trying to address.

Occlusion and limited field of view in single-camera 3D perception
Bird's-Eye-View road occupancy detection
Sim-to-real generalization evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-camera BEV road occupancy detection
Synthetic dataset generation with CARLA
Sim-to-real evaluation with fine-tuning