🤖 AI Summary
To address classifiers' poor adaptability under test-time distribution shifts and the calibration degradation caused by self-training, this paper proposes an online entropy distribution matching framework. A betting-martingale detector monitors the stream of predictive entropy values for statistical shifts in real time, and a self-supervised loss grounded in optimal transport dynamically aligns the test-time entropy distribution with that of the source domain. Unlike conventional entropy minimization, the approach explicitly models and matches the full entropy distribution, which improves robustness under distribution shifts while preserving calibration when no shift is present. Experiments across various distribution shift scenarios show improved test-time accuracy under shift, with accuracy and calibration maintained in its absence, and the method outperforms leading entropy minimization methods.
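As a rough illustration of matching entropy distributions through optimal transport, the sketch below relies on a standard one-dimensional fact: for two equal-size samples and a convex cost, the optimal plan pairs the i-th smallest value of one sample with the i-th smallest value of the other. The function names, the squared cost, and the equal-size requirement are assumptions made for illustration; this is not the paper's actual loss.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_matching_loss(test_probs, source_entropies):
    """Squared 2-Wasserstein distance between two equal-size entropy samples.

    Hypothetical helper: in one dimension, sorting both samples and pairing
    them in order yields the optimal transport plan for a squared cost.
    """
    test_entropies = sorted(entropy(p) for p in test_probs)
    targets = sorted(source_entropies)
    if len(test_entropies) != len(targets):
        raise ValueError("this sketch assumes equal sample sizes")
    n = len(targets)
    return sum((t - s) ** 2 for t, s in zip(test_entropies, targets)) / n
```

The loss is zero when the test entropies are a reordering of the source entropies, and it grows as the two distributions drift apart in either direction. Plain entropy minimization, by contrast, only rewards pushing every entropy toward zero. Using such a loss for training would require a differentiable implementation in an autodiff framework, which this sketch does not provide.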
📝 Abstract
We present a novel approach for test-time adaptation via online self-training, consisting of two components. First, we introduce a statistical framework that detects distribution shifts in the classifier's entropy values obtained on a stream of unlabeled samples. Second, we devise an online adaptation mechanism that utilizes the evidence of distribution shifts captured by the detection tool to dynamically update the classifier's parameters. The resulting adaptation process drives the distribution of test entropy values obtained from the self-trained classifier to match that of the source domain, building invariance to distribution shifts. This approach departs from the conventional self-training method, which focuses on minimizing the classifier's entropy. Our approach combines concepts from betting martingales and online learning to form a detection tool capable of quickly reacting to distribution shifts. We then reveal a tight relation between our adaptation scheme and optimal transport, which forms the basis of our novel self-supervised loss. Experimental results demonstrate that our approach improves test-time accuracy under distribution shifts while maintaining accuracy and calibration in their absence, outperforming leading entropy minimization methods across various scenarios.
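The detection component can be pictured with a minimal testing-by-betting sketch. It assumes a held-out sample of source-domain entropies is available, maps each incoming test entropy through the empirical source CDF (so the result is roughly uniform on [0, 1] when there is no shift), and lets a wealth process bet on deviations from uniformity. The class name, the gradient-ascent update of the bet, the clipping range, and the default parameters are all illustrative assumptions rather than the authors' algorithm.

```python
import bisect


class BettingMartingaleDetector:
    """Hypothetical shift detector on a stream of entropy values."""

    def __init__(self, source_entropies, alpha=0.01, lr=0.5):
        self.source = sorted(source_entropies)
        self.threshold = 1.0 / alpha  # alarm level suggested by Ville's inequality
        self.lr = lr                  # step size for updating the bet (assumed)
        self.bet = 0.0                # chosen before seeing the next sample
        self.wealth = 1.0             # the gambler starts with unit wealth

    def update(self, test_entropy):
        """Process one entropy value; return True if a shift is flagged."""
        # Empirical source CDF evaluated at the new entropy value.
        u = bisect.bisect_right(self.source, test_entropy) / len(self.source)
        payoff = u - 0.5  # has zero mean when u is uniform on [0, 1]
        self.wealth *= 1.0 + self.bet * payoff
        # Online gradient ascent on log-wealth, used only for the NEXT sample.
        grad = payoff / (1.0 + self.bet * payoff)
        # Clipping to [-1.5, 1.5] keeps every wealth factor in [0.25, 1.75].
        self.bet = max(-1.5, min(1.5, self.bet + self.lr * grad))
        return self.wealth >= self.threshold
```

If the transformed values were exactly uniform, the wealth would be a nonnegative martingale starting at 1, so by Ville's inequality it would reach 1/alpha with probability at most alpha. Under a shift that moves entropies consistently higher or lower, the bet learns the direction and the wealth grows geometrically, which is what allows a quick reaction. With an empirical CDF the uniformity holds only approximately, so the guarantee here is heuristic.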