🤖 AI Summary
Existing test-time adaptation (TTA) methods apply a fixed adaptation pattern regardless of how diverse the incoming test data is, so their performance degrades when test batches alternate between low-diversity and high-diversity distributions. To address this, the paper proposes Diversity Adaptive Test-Time Adaptation (DATTA), which uses a Diversity Score to choose the normalization method and the fine-tuning strategy for each batch. It has three components: (i) Diversity Discrimination (DD) assesses batch diversity; (ii) Diversity Adaptive Batch Normalization (DABN) selects the normalization method based on the DD result; and (iii) Diversity Adaptive Fine-Tuning (DAFT) fine-tunes the model selectively. In the reported experiments, DATTA achieves up to 21% higher accuracy than state-of-the-art methods.
📝 Abstract
Test-time adaptation (TTA) addresses distribution shifts between training and testing data by adjusting models on test samples, which is crucial for improving model inference in real-world applications. However, traditional TTA methods typically apply a fixed pattern to dynamic data (low-diversity or high-diversity patterns), which often leads to performance degradation and, consequently, a decline in Quality of Experience (QoE). We observe two primary issues: (i) different scenarios require different normalization methods (e.g., Instance Normalization is optimal in mixed domains but not in static domains), and (ii) model fine-tuning can harm the model and waste time. Hence, it is crucial to design strategies that measure and manage distribution diversity to minimize its negative impact on model performance. Based on these observations, this paper proposes a new general method, Diversity Adaptive Test-Time Adaptation (DATTA), aimed at improving QoE. DATTA uses a Diversity Score to differentiate between high-diversity and low-diversity batches and dynamically selects the batch normalization method and fine-tuning strategy accordingly. It features three key components: Diversity Discrimination (DD) to assess batch diversity, Diversity Adaptive Batch Normalization (DABN) to tailor the normalization method based on DD insights, and Diversity Adaptive Fine-Tuning (DAFT) to selectively fine-tune the model. Experimental results show that our method achieves up to a 21% increase in accuracy compared to state-of-the-art methods, indicating that it maintains good model performance and is robust. Our code will be released soon.
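The control flow described above (score a batch, then pick a normalization method and a fine-tuning decision) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the Diversity Score, so the score used here (variance of per-sample means), the threshold value, the choice to skip fine-tuning on high-diversity batches, and all function names are assumptions made purely for illustration. Each sample is treated as a single-channel feature vector.

```python
from statistics import fmean, pvariance


def diversity_score(batch):
    """Hypothetical stand-in for the Diversity Score: variance of per-sample means."""
    return pvariance([fmean(sample) for sample in batch])


def instance_norm(batch, eps=1e-5):
    """Normalize each sample with its own mean and variance."""
    out = []
    for sample in batch:
        mu = fmean(sample)
        var = pvariance(sample, mu)
        out.append([(v - mu) / (var + eps) ** 0.5 for v in sample])
    return out


def batch_norm(batch, eps=1e-5):
    """Normalize every sample with statistics pooled over the whole batch."""
    flat = [v for sample in batch for v in sample]
    mu = fmean(flat)
    var = pvariance(flat, mu)
    return [[(v - mu) / (var + eps) ** 0.5 for v in sample] for sample in batch]


def adapt_batch(batch, threshold=1.0):
    """Gate normalization and fine-tuning on the (assumed) diversity score."""
    score = diversity_score(batch)
    high_diversity = score > threshold  # threshold is an illustrative assumption
    if high_diversity:
        normalized, norm_name = instance_norm(batch), "instance"
    else:
        normalized, norm_name = batch_norm(batch), "batch"
    return {
        "score": score,
        "norm": norm_name,
        "normalized": normalized,
        # Assumption: fine-tune only on low-diversity batches.
        "fine_tune": not high_diversity,
    }


if __name__ == "__main__":
    low = [[0.0, 1.0, 2.0], [0.1, 1.1, 2.1]]       # samples with similar statistics
    high = [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]   # samples with very different means
    print(adapt_batch(low)["norm"], adapt_batch(high)["norm"])
```

With these toy inputs, the first batch is routed to pooled batch statistics and the second to per-sample statistics. A real system would compute the score on network features and replace the boolean `fine_tune` flag with an actual parameter update.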