🤖 AI Summary
Existing bird's-eye view (BEV) segmentation methods, largely developed for pinhole cameras, degrade on fisheye cameras because of severe geometric distortion, ambiguous cross-view correspondences, and unstable temporal dynamics. To address these challenges, this paper proposes FishBEV, a BEV segmentation framework tailored to fisheye cameras. It combines three components: (1) a Distortion-Resilient Multi-scale Extraction (DRME) backbone that learns distortion-robust features while preserving scale consistency; (2) Uncertainty-aware Spatial Cross-Attention (U-SCA), which uses uncertainty estimation for more reliable cross-view alignment; and (3) Distance-aware Temporal Self-Attention (D-TSA), which balances near-field detail against far-field context to maintain temporal coherence. On the SynWoodScape dataset, FishBEV consistently outperforms state-of-the-art baselines on surround-view fisheye BEV segmentation.
📝 Abstract
As a cornerstone technique for autonomous driving, Bird's Eye View (BEV) segmentation has recently achieved remarkable progress with pinhole cameras. However, extending existing methods to fisheye cameras is non-trivial: severe geometric distortion, ambiguous multi-view correspondences, and unstable temporal dynamics all significantly degrade BEV performance. To address these challenges, we propose FishBEV, a novel BEV segmentation framework specifically tailored for fisheye cameras. The framework introduces three complementary innovations: a Distortion-Resilient Multi-scale Extraction (DRME) backbone that learns robust features under distortion while preserving scale consistency; an Uncertainty-aware Spatial Cross-Attention (U-SCA) mechanism that leverages uncertainty estimation for reliable cross-view alignment; and a Distance-aware Temporal Self-Attention (D-TSA) module that adaptively balances near-field details and far-field context to ensure temporal coherence. Extensive experiments on the SynWoodScape dataset demonstrate that FishBEV consistently outperforms SOTA baselines on surround-view fisheye BEV segmentation.