🤖 AI Summary
To address the weak zero-shot generalization of vehicle re-identification (ReID) models on unseen target domains, this paper proposes a two-stage Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo) framework. First, a Spectrum-based Transformation for Redundancy Elimination and Augmentation Module (STREAM) suppresses domain-related redundant information in source-domain images in the frequency domain. Second, a multi-expert confrontation and collaboration mechanism, guided by CLIP semantic prompts, encourages the modeling of fine-grained, complementary features. The method combines spectral-domain transformation, prompt-guided feature learning, and multi-expert knowledge fusion, going beyond conventional approaches that model only domain-invariant features. Extensive experiments demonstrate state-of-the-art performance across multiple cross-domain vehicle ReID benchmarks, with significant improvements in zero-shot identification accuracy.
📝 Abstract
Generalizable vehicle re-identification (ReID) seeks to develop models that can adapt to unknown target domains without additional fine-tuning or retraining. Previous works have mainly focused on extracting domain-invariant features by aligning data distributions across source domains. However, because of the inherent domain-related redundancy in source images, relying solely on common features is insufficient for accurately capturing complementary features with lower occurrence probability and smaller energy. To address this problem, we propose a two-stage Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo) method, which fully leverages the high-level semantics of Contrastive Language-Image Pretraining (CLIP) to obtain a diversified prompt set and achieve complementary feature representations. Specifically, we first design a Spectrum-based Transformation for Redundancy Elimination and Augmentation Module (STREAM), which uses simple image preprocessing to obtain two types of image inputs for training. Because STREAM eliminates domain-related redundancy in source images, it enables the model to pay closer attention to the detailed prompt set that is crucial for distinguishing fine-grained vehicles. The learned identity-related prompt set is then used to guide comprehensive representation learning of complementary features for final knowledge fusion and identity recognition. Inspired by the unity principle, MiKeCoCo integrates the diverse evaluation approaches of multiple experts to ensure the accuracy and consistency of ReID. Extensive experimental results demonstrate that our method achieves state-of-the-art performance.
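
The abstract does not specify how STREAM transforms the spectrum, so the following is only a minimal sketch of the general idea of frequency-domain preprocessing that yields two image inputs. It assumes, as is common in Fourier-based domain generalization, that domain-related appearance is concentrated in the low-frequency amplitude. The function name `make_two_views` and the parameters `radius_ratio` and `scale` are hypothetical and do not come from the paper.

```python
import numpy as np


def make_two_views(image, radius_ratio=0.1, scale=0.5):
    """Return (original, filtered) views of an H x W or H x W x C image.

    Hypothetical illustration, not the paper's STREAM: the filtered view
    attenuates the low-frequency amplitude and keeps the phase unchanged.
    """
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape[:2]

    # 2D FFT over the spatial axes, with the zero frequency moved to the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(img, axes=(0, 1)), axes=(0, 1))
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Disc of low frequencies around the centre (assumed to carry domain style).
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    mask = dist <= radius_ratio * min(h, w)
    mask[h // 2, w // 2] = False  # leave the DC term alone to keep mean brightness

    # Attenuate the selected amplitudes; all channels share the same mask.
    amplitude[mask] *= scale

    # Recombine amplitude and phase, then invert the transform.
    filtered_spectrum = amplitude * np.exp(1j * phase)
    filtered = np.fft.ifft2(
        np.fft.ifftshift(filtered_spectrum, axes=(0, 1)), axes=(0, 1)
    ).real
    return img, filtered


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.random((64, 64, 3))  # stand-in for a vehicle image
    original, filtered = make_two_views(demo)
    print(original.shape, filtered.shape)
```

In a training pipeline, both returned arrays could be fed to the model as separate inputs. Which frequency band to modify and by how much are design choices that this sketch fixes arbitrarily.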