🤖 AI Summary
This work addresses the domain shift problem in object detection caused by variations in weather, lighting, or scene conditions when models are trained on a single source domain and evaluated on unseen target domains. To tackle this challenge, the authors propose Cross-Domain Feature Knowledge Distillation (CD-FKD), an approach that combines global and instance-wise feature distillation for single-domain generalization. In CD-FKD, a teacher network processes the original source-domain data, while a student network is trained on diversified inputs produced by downscaling and image corruption, so that knowledge learned on clean data transfers to degraded inputs. The reported experiments show that CD-FKD outperforms state-of-the-art methods both in generalization to unseen target domains and in source-domain performance.
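The input diversification described above can be illustrated with a minimal sketch. It assumes a float image in [0, 1], box-filter downscaling, and additive Gaussian noise as the corruption; the function name `diversify` and its parameters are hypothetical, and the paper's actual augmentations may differ.

```python
import numpy as np

def diversify(image, factor=2, noise_std=0.1, rng=None):
    """Return a downscaled, noise-corrupted copy of an HxWxC float image in [0, 1].

    Hypothetical helper: the corruption type and scale factor are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = image.shape
    # Crop so both spatial dimensions divide evenly by the factor.
    h2, w2 = h - h % factor, w - w % factor
    cropped = image[:h2, :w2]
    # Downscale by averaging each factor x factor block (simple box filter).
    small = cropped.reshape(h2 // factor, factor, w2 // factor, factor, c).mean(axis=(1, 3))
    # Corrupt with additive Gaussian noise, then clip back to the valid range.
    noisy = small + rng.normal(0.0, noise_std, size=small.shape)
    return np.clip(noisy, 0.0, 1.0)

rng = np.random.default_rng(0)
source = rng.random((8, 8, 3))  # the teacher would see this original image
student_input = diversify(source, factor=2, noise_std=0.1, rng=rng)
print(student_input.shape)  # (4, 4, 3)
```

In this sketch the teacher keeps the original `source` array while only the student receives `student_input`, mirroring the asymmetric inputs the summary describes.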
📝 Abstract
Single-domain generalization, in which a model is trained on a single source domain and evaluated on unseen target domains, is essential for object detection. Domain shifts, such as changes in weather, lighting, or scene conditions, pose significant challenges to the generalization ability of existing models. To address this, we propose Cross-Domain Feature Knowledge Distillation (CD-FKD), which enhances the generalization capability of the student network by leveraging both global and instance-wise feature distillation. The proposed method trains the student network on data diversified through downscaling and corruption, whereas the teacher network receives the original source-domain data. The student network mimics the teacher's features through both global and instance-wise distillation, enabling it to extract object-centric features effectively, even for objects that are difficult to detect owing to corruption. Extensive experiments on challenging scenes demonstrate that CD-FKD outperforms state-of-the-art methods in both target-domain generalization and source-domain performance, validating its effectiveness in improving the robustness of object detection to domain shifts. This approach is valuable in real-world applications, such as autonomous driving and surveillance, where robust object detection in diverse environments is crucial.
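As a rough illustration of how global and instance-wise feature distillation might be combined, the sketch below uses mean squared error over the full feature map and over object regions. It assumes teacher and student feature maps have already been resized to the same shape and that boxes are given in feature-map coordinates; the function names, the MSE choice, and the loss weights are hypothetical and not taken from the paper, and a real detector would likely pool regions with an operator such as RoIAlign.

```python
import numpy as np

def global_distill_loss(f_teacher, f_student):
    """Mean squared error over the whole CxHxW feature map."""
    return float(np.mean((f_teacher - f_student) ** 2))

def instance_distill_loss(f_teacher, f_student, boxes):
    """Mean squared error averaged over object regions.

    boxes: list of (x1, y1, x2, y2) in feature-map coordinates, x2/y2 exclusive.
    """
    if not boxes:
        return 0.0
    losses = []
    for x1, y1, x2, y2 in boxes:
        # Compare only the features inside this object's box.
        t = f_teacher[:, y1:y2, x1:x2]
        s = f_student[:, y1:y2, x1:x2]
        losses.append(np.mean((t - s) ** 2))
    return float(np.mean(losses))

def feature_distill_loss(f_teacher, f_student, boxes, w_global=1.0, w_instance=1.0):
    """Weighted sum of the two terms; the weights are illustrative assumptions."""
    return (w_global * global_distill_loss(f_teacher, f_student)
            + w_instance * instance_distill_loss(f_teacher, f_student, boxes))

# Toy example: the student differs from the teacher only inside one object box.
teacher = np.zeros((1, 4, 4))
student = np.zeros((1, 4, 4))
student[0, 0:2, 0:2] = 1.0
total = feature_distill_loss(teacher, student, boxes=[(0, 0, 2, 2)])
print(total)  # 1.25 (global term 0.25 + instance term 1.0)
```

The toy example shows why the instance-wise term matters: an error confined to a small object is diluted in the global average (0.25) but counts fully in the per-object term (1.0), which pushes the student toward object-centric features.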