Training-Free Out-Of-Distribution Segmentation With Foundation Models

📅 2025-10-03

🤖 AI Summary

This work addresses out-of-distribution (OoD) region detection in semantic segmentation without additional training or outlier supervision. The proposed method takes features from an InternImage backbone that has been fine-tuned on segmentation datasets, clusters them with unsupervised K-Means, and applies confidence thresholding on the raw decoder logits to identify OoD clusters. The results suggest that general-purpose vision foundation models can separate in-distribution from OoD regions without OoD-specific training. On the RoadAnomaly and ADE-OoD benchmarks, the approach reaches 50.02 and 48.77 Average Precision, respectively, with InternImage-L, surpassing several supervised and unsupervised baselines. The authors present this as a promising direction for generic OoD segmentation methods that require minimal assumptions or additional data.

📝 Abstract

Detecting unknown objects in semantic segmentation is crucial for safety-critical applications such as autonomous driving. Large vision foundation models, including DINOv2, InternImage, and CLIP, have advanced visual representation learning by providing rich features that generalize well across diverse tasks. While their strength in closed-set semantic tasks is established, their capability to detect out-of-distribution (OoD) regions in semantic segmentation remains underexplored. In this work, we investigate whether foundation models fine-tuned on segmentation datasets can inherently distinguish in-distribution (ID) from OoD regions without any outlier supervision. We propose a simple, training-free approach that utilizes features from the InternImage backbone and applies K-Means clustering alongside confidence thresholding on raw decoder logits to identify OoD clusters. Our method achieves 50.02 Average Precision on the RoadAnomaly benchmark and 48.77 on the benchmark of ADE-OoD with InternImage-L, surpassing several supervised and unsupervised baselines. These results suggest a promising direction for generic OoD segmentation methods that require minimal assumptions or additional data.
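
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how Average Precision can be computed for per-pixel anomaly scores. It uses the standard non-interpolated definition and assumes no tied scores; the function name `average_precision` is illustrative, and the exact evaluation protocol of the RoadAnomaly and ADE-OoD benchmarks is not specified in this summary. The result lies in [0, 1]; the reported figures appear to be on a 0 to 100 scale.

```python
def average_precision(scores, labels):
    """Non-interpolated Average Precision.

    scores: per-pixel anomaly scores (higher means more likely OoD).
    labels: ground truth per pixel, 1 for OoD and 0 for in-distribution.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one OoD pixel is required")

    # Rank pixels from most to least anomalous.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this true OoD pixel
    return total / n_pos
```

A ranking that places every OoD pixel above every in-distribution pixel scores 1.0 under this definition; each in-distribution pixel ranked above an OoD pixel lowers the value.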

Problem

Research questions and friction points this paper is trying to address.

Detecting unknown objects in semantic segmentation for autonomous driving

Exploring foundation models' capability to identify out-of-distribution regions

Developing training-free OoD segmentation without outlier supervision

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses InternImage backbone features for OoD detection

Applies K-Means clustering to identify outlier regions

Employs confidence thresholding on decoder logits
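
The three ingredients above can be combined in several ways, and this summary does not give the paper's exact rule. The sketch below is one plausible reading, not the authors' implementation: per-pixel features are clustered with a small NumPy K-Means, the maximum raw logit serves as the per-pixel confidence, and a cluster is marked OoD when its mean confidence falls below a threshold. The function names, the farthest-point initialization, the cluster-mean rule, and the default values of `k` and `threshold` are all assumptions made for illustration.

```python
import numpy as np


def kmeans(features, k, n_iter=20):
    """Lloyd's K-Means with deterministic farthest-point initialization.

    features: (N, D) array. Returns an (N,) array of cluster indices.
    """
    features = np.asarray(features, dtype=float)
    # Assumed init: start from the first point, then repeatedly add the
    # point farthest from all centers chosen so far.
    chosen = [features[0]]
    for _ in range(1, k):
        nearest = np.min(
            [((features - c) ** 2).sum(axis=1) for c in chosen], axis=0
        )
        chosen.append(features[nearest.argmax()])
    centers = np.array(chosen)

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = features[labels == c]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels


def ood_mask(features, logits, k=4, threshold=0.0):
    """Flag pixels that fall in low-confidence feature clusters.

    features: (H, W, D) backbone features, assumed resized to the logit grid.
    logits:   (H, W, C) raw decoder logits.
    Returns a boolean (H, W) mask, True where a pixel is treated as OoD.
    """
    h, w, d = features.shape
    labels = kmeans(features.reshape(-1, d), k)
    confidence = logits.reshape(h * w, -1).max(axis=1)  # max raw logit per pixel

    mask = np.zeros(h * w, dtype=bool)
    for c in range(k):
        members = labels == c
        # Assumed rule: a cluster is OoD if its mean confidence is low.
        if members.any() and confidence[members].mean() < threshold:
            mask[members] = True
    return mask.reshape(h, w)
```

In practice `features` and `logits` would come from a segmentation model such as InternImage-L with its decoder head; here they are plain arrays so the sketch stays independent of any particular model API. Deciding per cluster rather than per pixel means a confident pixel inside an otherwise uncertain region is still flagged, which is one reason to combine clustering with thresholding.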
