🤖 AI Summary
To address the poor robustness of test-time adaptation (TTA) to out-of-distribution (OOD) samples in open-world settings—particularly its tendency to misclassify unknown classes as known—the paper proposes a hierarchical feature-driven TTA framework. Methodologically: (1) a Hierarchical Ladder Network integrates class tokens from multiple Transformer layers for fine-grained OOD detection; (2) an Attention Affine Network dynamically recalibrates self-attention weights to adapt to domain shift; and (3) a weighted entropy mechanism suppresses interference from low-confidence predictions. Evaluated on multiple benchmarks, the approach improves both in-distribution (ID) classification accuracy and OOD detection performance, yielding robust predictions for ID and OOD samples alike and offering a scalable, stable paradigm for open-world TTA.
📝 Abstract
Test-time adaptation (TTA) adjusts a model during the testing phase to cope with changes in sample distribution and enhance its adaptability to new environments. In real-world scenarios, models often encounter samples from unseen (out-of-distribution, OOD) categories. Misclassifying these as known (in-distribution, ID) classes not only degrades predictive accuracy but can also corrupt the adaptation process itself, leading to further errors on subsequent ID samples. Many existing TTA methods suffer substantial performance drops under such conditions. To address this challenge, we propose a Hierarchical Ladder Network (HLN) that extracts OOD features from class tokens aggregated across all Transformer layers. OOD detection is enhanced by fusing the original model's prediction with the HLN output through weighted probability fusion. To improve robustness under domain shift, we further introduce an Attention Affine Network (AAN) that adaptively refines the self-attention weights conditioned on token information, improving the model's classification performance on datasets exhibiting domain shift. Additionally, a weighted entropy mechanism dynamically suppresses the influence of low-confidence samples during adaptation. Experimental results show that our method significantly improves performance on widely used classification benchmarks.
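The two test-time components described above—weighted probability fusion of the backbone and HLN outputs, and entropy weighting that suppresses low-confidence samples—can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the mixing weight `alpha`, the entropy threshold `e0`, and the exponential weighting form (in the spirit of entropy-weighted TTA objectives such as EATA) are assumptions, since the abstract does not specify them.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def fused_probs(backbone_logits, hln_logits, alpha=0.5):
    # Weighted probability fusion of the original model prediction and the
    # HLN branch. alpha is a hypothetical mixing weight (not from the paper).
    return alpha * softmax(backbone_logits) + (1.0 - alpha) * softmax(hln_logits)

def weighted_entropy_loss(probs, e0=None):
    # Shannon entropy per sample.
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    # Down-weight low-confidence (high-entropy) samples; samples above the
    # assumed threshold e0 are excluded entirely. The paper's exact weighting
    # function may differ.
    if e0 is None:
        e0 = 0.4 * np.log(probs.shape[-1])  # assumed fraction of max entropy
    w = np.where(ent < e0, np.exp(e0 - ent), 0.0)
    return (w * ent).mean(), w
```

For example, a sharply peaked fused prediction receives a positive weight, while a near-uniform (low-confidence) one is zeroed out, so only confident samples drive the adaptation objective.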