🤖 AI Summary
Existing foreground-background decomposition methods for few-shot out-of-distribution (FS-OOD) detection suffer from local class-similarity bias and inflexible background patch extraction, limiting robustness. To address these issues, we propose Mambo—a novel framework that introduces semantic-aware background prompt learning to jointly model local background similarity for refined foreground segmentation. Furthermore, Mambo incorporates a patch self-calibration strategy that dynamically optimizes both the number and spatial locations of background patches, breaking away from static extraction paradigms. This design preserves few-shot adaptability while significantly improving background modeling fidelity. Extensive experiments on multiple real-world benchmarks demonstrate that Mambo achieves state-of-the-art performance on both general and near-OOD detection tasks. Notably, it excels in fine-grained discrimination and exhibits superior robustness to input noise and domain shifts.
📝 Abstract
Existing foreground-background (FG-BG) decomposition methods for the few-shot out-of-distribution (FS-OOD) detection often suffer from low robustness due to over-reliance on the local class similarity and a fixed background patch extraction strategy. To address these challenges, we propose a new FG-BG decomposition framework, namely Mambo, for FS-OOD detection. Specifically, we propose to first learn a background prompt to obtain the local background similarity containing both the background and image semantic information, and then refine the local background similarity using the local class similarity. As a result, we use both the refined local background similarity and the local class similarity to conduct background extraction, reducing the dependence of the local class similarity in previous methods. Furthermore, we propose the patch self-calibrated tuning to consider the sample diversity to flexibly select numbers of background patches for different samples, and thus exploring the issue of fixed background extraction strategies in previous methods. Extensive experiments on real-world datasets demonstrate that our proposed Mambo achieves the best performance, compared to SOTA methods in terms of OOD detection and near OOD detection setting. The source code will be released at https://github.com/YuzunoKawori/Mambo.