WiSE-OD: Benchmarking Robustness in Infrared Object Detection

📅 2025-07-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Infrared object detection models suffer from insufficient robustness under cross-modal (RGB→infrared) transfer and distribution shifts. To address this, we propose WiSE-OD, a weight-space ensembling framework for robust object detection that fuses zero-shot and fine-tuned model weights directly in parameter space, with no additional training or inference overhead, thereby jointly improving accuracy and robustness. For systematic evaluation, we introduce two image-level corruption benchmarks, LLVIP-C and FLIR-C. Extensive experiments, including out-of-distribution cross-modal evaluation, linear probing, and multi-architecture validation (Faster R-CNN, YOLOv5, DETR), demonstrate that WiSE-OD significantly improves robustness to diverse corruptions (e.g., noise, blur, weather artifacts) and domain shifts while preserving in-distribution accuracy.

📝 Abstract
Object detection (OD) in infrared (IR) imagery is critical for low-light and nighttime applications. However, the scarcity of large-scale IR datasets forces models to rely on weights pre-trained on RGB images. While fine-tuning on IR improves accuracy, it often compromises robustness under distribution shifts due to the inherent modality gap between RGB and IR. To address this, we introduce LLVIP-C and FLIR-C, two cross-modality out-of-distribution (OOD) benchmarks built by applying corruption to standard IR datasets. Additionally, to fully leverage the complementary knowledge from RGB and infrared trained models, we propose WiSE-OD, a weight-space ensembling method with two variants: WiSE-OD$_{ZS}$, which combines RGB zero-shot and IR fine-tuned weights, and WiSE-OD$_{LP}$, which blends zero-shot and linear probing. Evaluated across three RGB-pretrained detectors and two robust baselines, WiSE-OD improves both cross-modality and corruption robustness without any additional training or inference cost.
Problem

Research questions and friction points this paper is trying to address.

Addressing robustness gaps in infrared object detection under distribution shifts
Bridging modality differences between RGB and infrared image datasets
Enhancing cross-modality robustness without extra training or inference costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces LLVIP-C and FLIR-C OOD benchmarks
Proposes WiSE-OD weight-space ensembling method
Enhances robustness without extra training or inference cost
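
The core mechanism behind WiSE-OD, weight-space ensembling, can be sketched as a linear interpolation between two checkpoints that share the same architecture: the zero-shot (RGB-pretrained) weights and the IR fine-tuned (or linear-probed) weights. A minimal sketch follows; the function name, the mixing coefficient `alpha`, and the plain-dict representation of the weights are illustrative assumptions, not details from the paper (in practice this would operate on framework state dicts of tensors):

```python
def wise_od_interpolate(theta_zs, theta_ft, alpha=0.5):
    """Interpolate two weight dictionaries in parameter space.

    theta_zs: zero-shot (e.g., RGB-pretrained) weights
    theta_ft: fine-tuned or linear-probed (e.g., IR) weights
    alpha:    mixing coefficient; 0.0 returns theta_zs, 1.0 returns theta_ft
    """
    # Both checkpoints must share the same architecture / parameter names.
    assert theta_zs.keys() == theta_ft.keys(), "parameter names must match"
    return {name: (1.0 - alpha) * theta_zs[name] + alpha * theta_ft[name]
            for name in theta_zs}


# Toy example with scalar "parameters" standing in for weight tensors.
theta_zs = {"conv.weight": 0.0, "conv.bias": 2.0}
theta_ft = {"conv.weight": 1.0, "conv.bias": 0.0}
theta_mixed = wise_od_interpolate(theta_zs, theta_ft, alpha=0.5)
```

Because the merge happens once, offline, the resulting model has exactly the same parameter count and inference path as either parent, which is why no extra training or inference cost is incurred.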