Prior2Former -- Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation

📅 2025-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Panoptic segmentation methods are constrained by predefined categories, which limits reliable identification of unknown classes and out-of-distribution (OOD) data and hinders deployment in safety-critical applications such as autonomous driving. To address this, the paper proposes the first framework integrating evidential deep learning into mask transformers. The method introduces a learnable Beta prior that enables assumption-free, pixel-wise uncertainty quantification over binary instance masks, without requiring OOD samples, void-class supervision, or contrastive training. It performs anomaly instance detection and panoptic segmentation end-to-end within the mask transformer architecture. Evaluated on Cityscapes, COCO, SegmentMeIfYouCan, and the OoDIS benchmark, the approach achieves state-of-the-art performance; notably, it ranks first on the OoDIS anomaly instance segmentation benchmark among methods that use no OOD training data.

📝 Abstract
In panoptic segmentation, individual instances must be separated within semantic classes. As state-of-the-art methods rely on a predefined set of classes, they struggle with novel categories and out-of-distribution (OOD) data. This is particularly problematic in safety-critical applications, such as autonomous driving, where reliability in unseen scenarios is essential. We address the gap between outstanding benchmark performance and reliability by proposing Prior2Former (P2F), the first approach for segmentation vision transformers rooted in evidential learning. P2F extends the mask vision transformer architecture by incorporating a Beta prior for computing model uncertainty in pixel-wise binary mask assignments. This design enables high-quality uncertainty estimation that effectively detects novel and OOD objects, enabling state-of-the-art anomaly instance segmentation and open-world panoptic segmentation. Unlike most segmentation models addressing unknown classes, P2F operates without access to OOD data samples or contrastive training on void (i.e., unlabeled) classes, making it highly applicable in real-world scenarios where such prior information is unavailable. Additionally, P2F can be flexibly applied to both anomaly instance and panoptic segmentation. Through comprehensive experiments on the Cityscapes, COCO, SegmentMeIfYouCan, and OoDIS datasets, we demonstrate the state-of-the-art performance of P2F. It achieves the highest ranking in the OoDIS anomaly instance benchmark among methods not using OOD data in any way.
Problem

Research questions and friction points this paper is trying to address.

Addresses novel categories and OOD data in panoptic segmentation
Enables high-quality uncertainty estimation for unseen scenarios
Operates without OOD data or contrastive training on void classes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evidential learning for mask transformers
Beta prior for pixel-wise uncertainty
No OOD data or contrastive training needed
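To make the Beta-prior idea concrete, here is a minimal, hypothetical sketch of an evidential head for binary mask prediction: the network outputs two non-negative evidence values per pixel (for and against mask membership), which parameterize a Beta distribution whose mean gives the mask probability and whose total evidence gives a vacuity-style uncertainty. This is an illustration of Beta-based evidential uncertainty in general, not the paper's exact formulation; the function name and the specific uncertainty measure are assumptions.

```python
import torch
import torch.nn.functional as F


def beta_evidential_head(logits: torch.Tensor):
    """Illustrative Beta-evidential mask head (hypothetical, not P2F's exact design).

    logits: raw network outputs of shape (N, 2, H, W), one channel of
    evidence for the pixel belonging to the mask and one against it.
    """
    # Softplus maps raw outputs to non-negative evidence values.
    evidence = F.softplus(logits)
    alpha = evidence[:, 0] + 1.0  # Beta parameter: evidence "for" + prior count 1
    beta = evidence[:, 1] + 1.0   # Beta parameter: evidence "against" + prior count 1

    # Expected mask probability under Beta(alpha, beta).
    prob = alpha / (alpha + beta)

    # Vacuity-style uncertainty: close to 1 when total evidence is low,
    # shrinking toward 0 as the model accumulates evidence for a pixel.
    uncertainty = 2.0 / (alpha + beta)
    return prob, uncertainty
```

High per-pixel uncertainty can then flag pixels of novel or OOD objects without ever training on OOD samples, which is the intuition behind evidential anomaly detection.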
Sebastian Schmidt
Technical University of Munich, BMW Group
Julius Körner
Technical University of Munich
Dominik Fuchsgruber
Technical University of Munich
Stefano Gasperini
Postdoc at Technical University of Munich (TUM)
computer vision, deep learning, autonomous driving
Federico Tombari
Google, TU Munich
Computer Vision, Machine Learning, 3D Perception
Stephan Günnemann
Technical University of Munich