🤖 AI Summary
In autonomous driving, single-modality anomaly segmentation models suffer from high false-positive rates due to excessive anomaly scores on non-anomalous regions. To address this, we propose MMRAS+, the first multimodal (image + text) uncertainty-aware anomaly segmentation framework tailored for road scenes. Methodologically, MMRAS+ introduces a novel multimodal uncertainty modeling mechanism that integrates the CLIP text encoder to enable cross-modal alignment between visual and semantic textual features while quantifying predictive uncertainty. A lightweight ensemble module further enhances robustness. Evaluated on RoadAnomaly, SMIYC, and Fishyscapes, MMRAS+ significantly outperforms state-of-the-art single-modality methods, effectively suppressing false-positive responses on non-anomalous categories. The source code is publicly available.
📝 Abstract
Semantic segmentation allows autonomous vehicles to comprehensively understand their surroundings. However, it is equally crucial that the model detect obstacles that may jeopardize the safety of autonomous driving systems. In our experiments, we find that current uni-modal anomaly segmentation frameworks tend to produce high anomaly scores for non-anomalous regions in images. Motivated by this empirical finding, we develop a multi-modal uncertainty-based anomaly segmentation framework, named MMRAS+, for autonomous driving systems. MMRAS+ effectively reduces the high anomaly scores assigned to non-anomalous classes by introducing the text modality through the CLIP text encoder; indeed, MMRAS+ is the first multi-modal anomaly segmentation solution for autonomous driving. Moreover, we develop an ensemble module to further boost anomaly segmentation performance. Experiments on the RoadAnomaly, SMIYC, and Fishyscapes validation datasets demonstrate the superior performance of our method. The code is available at https://github.com/HengGao12/MMRAS_plus.
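To make the core idea concrete, here is a minimal sketch of how text-modality similarity could suppress spurious anomaly scores on in-distribution pixels. This is not the paper's actual implementation: the function names, the max-logit baseline score, and the linear fusion with weight `alpha` are all illustrative assumptions; the real MMRAS+ pipeline uses the CLIP text encoder and an ensemble module as described above.

```python
import numpy as np

def max_logit_anomaly(logits):
    # Baseline uni-modal score: a pixel is "anomalous" when no known
    # class fires strongly, i.e. the max class logit is low.
    # logits: (num_pixels, num_classes)
    return -logits.max(axis=-1)

def text_refined_anomaly(logits, pixel_emb, text_emb, alpha=0.5):
    # Hypothetical image+text fusion (illustrative, not the paper's method):
    # cosine similarity between pixel embeddings and text embeddings of
    # known class names lowers the anomaly score wherever a pixel matches
    # some in-distribution class, reducing false positives.
    # pixel_emb: (num_pixels, dim), text_emb: (num_classes, dim)
    base = max_logit_anomaly(logits)
    p = pixel_emb / np.linalg.norm(pixel_emb, axis=-1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)
    sim = (p @ t.T).max(axis=-1)  # best match over known-class prompts
    return base - alpha * sim
```

A pixel whose embedding closely matches a known-class text prompt (e.g. "a photo of a road") thus receives a lower anomaly score than one with no textual match, mirroring the false-positive suppression the abstract describes.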