Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation

📅 2025-05-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing single-modality out-of-distribution (OOD) detection methods struggle with pixel-level OOD detection and segmentation in safety-critical multimodal scenarios (e.g., autonomous driving, surgical robotics), particularly under unknown distributions across modalities (e.g., image + LiDAR), and suffer from pervasive OOD overconfidence. This work proposes Feature Mixing, an unsupervised, modality-agnostic method with theoretical support for synthesizing multimodal anomalous features. It also introduces CARLA-OOD, the first multimodal semantic segmentation benchmark containing synthetically generated OOD objects. The framework jointly leverages multimodal feature fusion, OOD confidence calibration, and cross-modal consistency constraints. Evaluated on SemanticKITTI, nuScenes, CARLA-OOD, and MultiOOD, it achieves state-of-the-art performance while accelerating inference by 10× to 370× and significantly mitigating OOD overconfidence.

๐Ÿ“ Abstract
Out-of-distribution (OOD) detection and segmentation are crucial for deploying machine learning models in safety-critical applications such as autonomous driving and robot-assisted surgery. While prior research has primarily focused on unimodal image data, real-world applications are inherently multimodal, requiring the integration of multiple modalities for improved OOD detection. A key challenge is the lack of supervision signals from unknown data, leading to overconfident predictions on OOD samples. To address this challenge, we propose Feature Mixing, an extremely simple and fast method for multimodal outlier synthesis with theoretical support, which can be further optimized to help the model better distinguish between in-distribution (ID) and OOD data. Feature Mixing is modality-agnostic and applicable to various modality combinations. Additionally, we introduce CARLA-OOD, a novel multimodal dataset for OOD segmentation, featuring synthetic OOD objects across diverse scenes and weather conditions. Extensive experiments on SemanticKITTI, nuScenes, CARLA-OOD datasets, and the MultiOOD benchmark demonstrate that Feature Mixing achieves state-of-the-art performance with a $10\times$ to $370\times$ speedup. Our source code and dataset will be available at https://github.com/mona4399/FeatureMixing.
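The abstract does not spell out the mixing rule, but the idea of synthesizing outliers by mixing features can be sketched as follows. This is an illustrative guess, not the paper's exact algorithm: `feature_mixing`, the `(N, D)` feature batches, and the convex-mixing rule with a shuffled partner modality are all assumptions made for the example.

```python
import numpy as np

def feature_mixing(feat_a, feat_b, alpha=0.5, rng=None):
    """Synthesize pseudo-OOD features from two modality feature batches.

    Hypothetical sketch: feat_a and feat_b are (N, D) feature batches
    from two modalities (e.g. image and LiDAR). Each sample of modality A
    is convexly mixed with a randomly permuted sample of modality B,
    producing features that lie off both in-distribution manifolds.
    """
    rng = np.random.default_rng() if rng is None else rng
    perm = rng.permutation(feat_b.shape[0])                   # shuffle modality-B samples
    lam = rng.uniform(0.0, alpha, size=(feat_a.shape[0], 1))  # per-sample mixing weight
    return (1.0 - lam) * feat_a + lam * feat_b[perm]          # cross-modal convex mix

# Toy usage: 4 samples, 8-dim features per modality.
img_feats = np.random.randn(4, 8)
lidar_feats = np.random.randn(4, 8)
ood_feats = feature_mixing(img_feats, lidar_feats)
print(ood_feats.shape)  # (4, 8)
```

Such synthesized features would then be labeled as outliers during training so the model's confidence is calibrated against them, without requiring any real OOD data.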
Problem

Research questions and friction points this paper is trying to address.

Multimodal outlier synthesis for OOD detection and segmentation
Lack of supervision signals from unknown OOD data
Need for modality-agnostic OOD detection in real-world applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feature Mixing for multimodal outlier synthesis
Modality-agnostic approach applicable to various modality combinations
Introduces CARLA-OOD dataset for segmentation