AI Summary
This work addresses the significant performance degradation of open-vocabulary object detection models when they are deployed in target domains that lack annotations, such as nighttime or foggy scenes. The authors propose ABRA, which formulates cross-domain knowledge transfer as a geometric transport problem in the weight space of a pretrained detector. By aligning source-domain and target-domain experts, ABRA transfers category-level knowledge without requiring annotations in the target domain. The method applies to open-vocabulary detection frameworks such as Grounding DINO and shows substantial performance gains across diverse and challenging domain-shift scenarios, including adverse environmental conditions.
Abstract
Although recent Open-Vocabulary Object Detection architectures, such as Grounding DINO, demonstrate strong zero-shot capabilities, their performance degrades significantly under domain shifts. Moreover, many domains of practical interest, such as nighttime or foggy scenes, lack large annotated datasets, preventing direct fine-tuning. In this paper, we introduce Aligned Basis Relocation for Adaptation (ABRA), a method that transfers class-specific detection knowledge from a labeled source domain to a target domain where no training images containing these classes are accessible. ABRA formulates this adaptation as a geometric transport problem in the weight space of a pretrained detector, aligning source and target domain experts to transport class-specific knowledge. Extensive experiments across challenging domain shifts demonstrate that ABRA successfully teleports class-level specialization under multiple adverse conditions. Our code will be made public upon acceptance.
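The abstract describes the weight-space transport only at a high level. As a rough illustration of the general idea, and not the authors' algorithm, the sketch below assumes that each expert is a single weight matrix, that the domain shift can be modeled as an orthogonal map between the source and target domain updates, and that this map is estimated with an orthogonal Procrustes fit (a swapped-in, standard alignment technique). The function names `procrustes_rotation` and `transport_class_delta`, and all of the toy matrices, are hypothetical.

```python
import numpy as np


def procrustes_rotation(src, tgt):
    """Return the orthogonal matrix R that minimizes ||R @ src - tgt||_F."""
    u, _, vt = np.linalg.svd(tgt @ src.T)
    return u @ vt


def transport_class_delta(w0, w_src_domain, w_tgt_domain, w_src_class):
    """Move a class-specific update from the source expert onto the target expert.

    Assumption (not from the paper): the map aligning the two domain updates
    also carries the class-specific update across domains.
    """
    d_src = w_src_domain - w0           # source-domain update
    d_tgt = w_tgt_domain - w0           # target-domain update
    c_src = w_src_class - w_src_domain  # class-specific update learned on source
    r = procrustes_rotation(d_src, d_tgt)
    return w_tgt_domain + r @ c_src


# Toy setup: the target-domain update is an exact rotation of the source one.
rng = np.random.default_rng(0)
d = 4
w0 = rng.normal(size=(d, d))                  # stand-in for pretrained weights
q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # ground-truth orthogonal shift
d_src = rng.normal(size=(d, d))
c_src = rng.normal(size=(d, d))

w_src_domain = w0 + d_src
w_tgt_domain = w0 + q @ d_src
w_src_class = w_src_domain + c_src

w_tgt_class = transport_class_delta(w0, w_src_domain, w_tgt_domain, w_src_class)
```

In this toy setting the fitted map recovers the ground-truth shift exactly, so the transported class update equals `q @ c_src`. A real detector has many layers and non-orthogonal shifts, so any per-layer alignment of this kind would only be an approximation.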