FACTOR: Counterfactual Training-Free Test-Time Adaptation for Open-Vocabulary Object Detection

📅 2026-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the vulnerability of open-vocabulary object detection to spurious correlations between non-causal visual attributes (such as brightness and texture) and object categories under distribution shifts. To mitigate this, the paper introduces FACTOR, a training-free test-time adaptation framework built on explicit counterfactual reasoning. It generates counterfactual views of each test image by perturbing non-causal attributes, then compares region-level predictions between the original and counterfactual views to quantify attribute sensitivity, semantic relevance, and prediction variation. Using these quantities, the method selectively suppresses attribute-dependent predictions without online optimization or parameter updates, so the correction is attribute-specific rather than a global calibration. Experiments on PASCAL-C, COCO-C, and FoggyCityscapes show that the approach consistently outperforms prior test-time adaptation methods under distribution shifts.

📝 Abstract

Open-vocabulary object detection often fails under distribution shifts, as it can be misled by spurious correlations between non-causal visual attributes (e.g., brightness, texture) and object categories. Existing test-time adaptation (TTA) methods either depend on costly online optimization or perform global calibration, overlooking the attribute-specific nature of these failures. To address this, we propose FACTOR (counterFACtual training-free Test-time adaptation for Open-vocabulaRy object detection), a lightweight framework grounded in counterfactual reasoning. By perturbing test images along non-causal attributes and comparing region-level predictions between original and counterfactual views, FACTOR quantifies attribute sensitivity, semantic relevance, and prediction variation to selectively suppress attribute-dependent predictions, without parameter updates. Experiments on PASCAL-C, COCO-C, and FoggyCityscapes show that FACTOR consistently outperforms prior TTA methods, demonstrating that explicit counterfactual reasoning effectively improves robustness under distribution shifts.
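
The comparison between original and counterfactual views can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not give formulas, so the `Region` structure, the sensitivity measure (largest absolute score change across views), and the `threshold` and `penalty` values are all assumptions chosen for illustration, and semantic relevance and prediction variation are left out. The sketch only shows the general idea of down-weighting a region whose confidence changes sharply when a non-causal attribute is perturbed, with no parameter updates.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical container for one detected region (not from the paper)."""
    label: str
    score: float                  # confidence on the original image
    counterfactual_scores: dict   # attribute name -> confidence on the perturbed view


def attribute_sensitivity(region):
    """Assumed measure: largest absolute score change across counterfactual views."""
    return max(abs(region.score - s) for s in region.counterfactual_scores.values())


def adjust_scores(regions, threshold=0.3, penalty=0.5):
    """Down-weight regions whose score depends strongly on a perturbed attribute.

    threshold and penalty are illustrative values, not taken from the paper.
    """
    adjusted = []
    for r in regions:
        sens = attribute_sensitivity(r)
        new_score = r.score * penalty if sens > threshold else r.score
        adjusted.append((r.label, round(new_score, 3), round(sens, 3)))
    return adjusted


# Made-up scores: "car" is stable across views, "boat" collapses under a brightness change.
regions = [
    Region("car", 0.90, {"brightness": 0.88, "texture": 0.85}),
    Region("boat", 0.80, {"brightness": 0.30, "texture": 0.75}),
]
result = adjust_scores(regions)
for label, score, sens in result:
    print(label, score, sens)
```

In this toy input the stable "car" detection keeps its score, while the brightness-dependent "boat" detection is halved. Because the adjustment only rescales scores at inference time, it requires no gradient steps, which is the property the abstract emphasizes.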

Problem

Research questions and friction points this paper is trying to address.

open-vocabulary object detection

distribution shift

spurious correlations

non-causal attributes

test-time adaptation

Innovation

Methods, ideas, or system contributions that make the work stand out.

counterfactual reasoning

test-time adaptation

open-vocabulary object detection