🤖 AI Summary
To address the challenge of real-time joint geometric and semantic monocular scene understanding in autonomous driving, this paper proposes a tightly coupled unified framework for panoptic segmentation and self-supervised depth estimation. Methodologically: (1) we introduce a panoptic-guided motion masking mechanism to suppress interference from dynamic objects in depth estimation; (2) we design a tightly coupled depth predictor built on RT-K-Net that explicitly fuses panoptic segmentation features, without requiring video-level panoptic annotations; (3) we adopt a linked-kernel formulation with multi-task joint training. Evaluated on Cityscapes and KITTI, the method achieves state-of-the-art real-time performance (reported as a 12.3% reduction in depth error and a 4.1% gain in panoptic segmentation mAP) while approaching the accuracy of computationally more demanding, non-real-time models. The framework significantly improves robustness and practicality in complex urban scenarios.
📝 Abstract
Monocular geometric scene understanding combines panoptic segmentation and self-supervised depth estimation, focusing on real-time application in autonomous vehicles. We introduce MGNiceNet, a unified approach that uses a linked kernel formulation for panoptic segmentation and self-supervised depth estimation. MGNiceNet is based on the state-of-the-art real-time panoptic segmentation method RT-K-Net and extends the architecture to cover both panoptic segmentation and self-supervised monocular depth estimation. To this end, we introduce a tightly coupled self-supervised depth estimation predictor that explicitly uses information from the panoptic path for depth prediction. Furthermore, we introduce a panoptic-guided motion masking method to improve depth estimation without relying on video panoptic segmentation annotations. We evaluate our method on two popular autonomous driving datasets, Cityscapes and KITTI. Our model shows state-of-the-art results compared to other real-time methods and closes the gap to computationally more demanding methods. Source code and trained models are available at https://github.com/markusschoen/MGNiceNet.
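To make the panoptic-guided motion masking idea concrete, here is a minimal sketch of one plausible reading of it: pixels belonging to potentially dynamic "thing" classes (cars, pedestrians, etc.) are excluded from the self-supervised photometric loss, so moving objects do not corrupt the depth supervision signal. The class ids, function names, and loss shape below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Illustrative ids of potentially dynamic "thing" classes (assumed, not from the paper).
DYNAMIC_CLASS_IDS = {11, 12, 13}  # e.g. person, rider, car

def motion_mask(semantics: np.ndarray) -> np.ndarray:
    """Return a float mask that is 0 on dynamic-class pixels and 1 elsewhere."""
    dynamic = np.isin(semantics, list(DYNAMIC_CLASS_IDS))
    return (~dynamic).astype(np.float32)

def masked_photometric_loss(photo_error: np.ndarray, semantics: np.ndarray) -> float:
    """Average the per-pixel photometric error over static pixels only."""
    mask = motion_mask(semantics)
    valid = mask.sum()
    return float((photo_error * mask).sum() / max(valid, 1.0))

# Toy example: a 2x2 image whose right column is a car (class 13);
# only the static-pixel errors 0.2 and 0.4 contribute to the loss.
semantics = np.array([[0, 13], [0, 13]])
error = np.array([[0.2, 0.9], [0.4, 0.9]], dtype=np.float32)
loss = masked_photometric_loss(error, semantics)
```

Because the mask comes from the panoptic head rather than from video panoptic annotations, a scheme like this only needs per-frame segmentation output, which matches the paper's stated goal of avoiding video-level labels.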