Adaptive Debiasing Tsallis Entropy for Test-Time Adaptation

📅 2026-02-12
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the inaccuracy of Shannon entropy-based uncertainty estimation in vision-language models (e.g., CLIP) during test-time adaptation, which stems from distributional biases in pretraining data. To mitigate this, the authors propose Adaptive Debiasing Tsallis Entropy (ADTE), the first method to incorporate Tsallis entropy into test-time adaptation. ADTE dynamically learns a non-extensive parameter \( q^l \) for each class, enabling precise modeling of biased output distributions without introducing additional hyperparameters. By accurately identifying high-confidence samples and integrating them with a label refinement strategy, ADTE achieves state-of-the-art performance on ImageNet and its five distribution-shifted variants, and attains the best average results across ten cross-domain benchmarks. The approach is model-agnostic and compatible with diverse architectures and textual prompts.
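The summary hinges on Tsallis entropy being a generalization of Shannon entropy via the non-extensive parameter q. A minimal numerical sketch of that relationship (not the paper's implementation; the probability vector and q values are illustrative):

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis entropy T_q(p) = (1 - sum_i p_i^q) / (q - 1).

    Recovers Shannon entropy (natural log) in the limit q -> 1.
    """
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        # Shannon limit; mask zero probabilities to avoid log(0).
        nz = p > 0
        return float(-np.sum(p[nz] * np.log(p[nz])))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

probs = np.array([0.7, 0.2, 0.1])
print(tsallis_entropy(probs, 1.0))    # Shannon entropy
print(tsallis_entropy(probs, 0.999))  # converges to the Shannon value
print(tsallis_entropy(probs, 2.0))    # equals 1 - sum_i p_i^2 (collision index)
```

Varying q reweights how strongly low- versus high-probability classes contribute, which is what makes a per-class q useful for biased output distributions.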

๐Ÿ“ Abstract
Mainstream Test-Time Adaptation (TTA) methods for adapting vision-language models, e.g., CLIP, typically rely on Shannon Entropy (SE) at test time to measure prediction uncertainty and inconsistency. However, since CLIP carries a built-in bias from pretraining on highly imbalanced web-crawled data, SE inevitably produces biased uncertainty estimates. To address this issue, we find and demonstrate that Tsallis Entropy (TE), a generalized form of SE, is naturally suited to characterizing biased distributions through a non-extensive parameter q, with the performance of SE serving as a lower bound for TE. Building upon this, we generalize TE into Adaptive Debiasing Tsallis Entropy (ADTE) for TTA, customizing for each category a class-specific parameter q^l derived by normalizing the label bias estimated from continuously incoming test instances. This adaptive approach allows ADTE to accurately select high-confidence views and to integrate seamlessly with a label adjustment strategy that enhances adaptation, without distribution-specific hyperparameter tuning. Moreover, our investigation reveals that both TE and ADTE can serve as direct, advanced alternatives to SE in TTA without any other modifications. Experimental results show that ADTE outperforms state-of-the-art methods on ImageNet and its five variants, and achieves the highest average performance on ten cross-domain benchmarks, regardless of the model architecture or text prompts used. Our code is available at https://github.com/Jinx630/ADTE.
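The abstract derives a class-specific q^l by normalizing the label bias estimated from the incoming test stream, but does not spell out the mapping here. The following is a hedged sketch, assuming the bias is tracked as an exponential moving average of predicted class probabilities and min-max normalized into a fixed q range; the `momentum`, `q_min`, and `q_max` values are illustrative choices, not the paper's formula:

```python
import numpy as np

def update_bias(bias_ema, probs, momentum=0.99):
    """Track label bias as an EMA of the model's predicted class probabilities."""
    return momentum * bias_ema + (1.0 - momentum) * probs

def class_specific_q(bias_ema, q_min=0.8, q_max=1.2):
    """Map the normalized per-class bias estimate to one q^l per class.

    Min-max normalization and the (q_min, q_max) range are illustrative,
    not the paper's exact derivation.
    """
    b = bias_ema / bias_ema.sum()
    b_norm = (b - b.min()) / (b.max() - b.min() + 1e-12)
    return q_min + (q_max - q_min) * b_norm

# Simulated test stream over 4 classes: uniform prior, then biased predictions.
bias = np.full(4, 0.25)
for probs in [np.array([0.7, 0.1, 0.1, 0.1]),
              np.array([0.6, 0.2, 0.1, 0.1])]:
    bias = update_bias(bias, probs)
q_per_class = class_specific_q(bias)
print(q_per_class)  # larger q^l for classes the model over-predicts
```

Because the q^l values are read off the running bias estimate rather than tuned per dataset, this keeps the method free of distribution-specific hyperparameters, as the abstract claims.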
Problem

Research questions and friction points this paper is trying to address.

Test-Time Adaptation
Shannon Entropy
Bias
Vision-Language Models
Uncertainty Estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tsallis Entropy
Test-Time Adaptation
Adaptive Debiasing
Vision-Language Models
Label Bias
🔎 Similar Papers
No similar papers found.