🤖 AI Summary
Existing test-time adaptation (TTA) methods rely on updating normalization layers, but batch statistics become unstable under small batch sizes, and pretrained statistics fail to generalize to unseen domains, causing severe performance degradation under large domain shifts. This paper proposes a novel buffer-layer-based TTA paradigm: without modifying the pretrained backbone, lightweight, plug-and-play buffer layers take over the role of normalization-layer updates and perform online feature adjustment, thereby avoiding both the reliance on batch statistics and the catastrophic forgetting induced by updating backbone parameters. The approach is compatible with mainstream TTA frameworks and achieves significant improvements over baselines across diverse architectures and challenging domain-shift scenarios, demonstrating superior robustness, generalization, and resistance to forgetting. Code is publicly available.
📄 Abstract
Among recent advances in Test-Time Adaptation (TTA), most existing methodologies focus on updating normalization layers to adapt to the test domain. However, reliance on normalization-based adaptation presents key challenges. First, normalization layers such as Batch Normalization (BN) are highly sensitive to small batch sizes, leading to unstable and inaccurate statistics. Moreover, normalization-based adaptation is inherently constrained by the structure of the pre-trained model, as it relies on training-time statistics that may not generalize well to unseen domains. These issues limit the effectiveness of normalization-based TTA approaches, especially under significant domain shift. In this paper, we introduce a novel paradigm based on the concept of a Buffer layer, which addresses the fundamental limitations of normalization-layer updates. Unlike existing methods that modify the core parameters of the model, our approach preserves the integrity of the pre-trained backbone, inherently mitigating the risk of catastrophic forgetting during online adaptation. Through comprehensive experimentation, we demonstrate that our approach not only outperforms traditional methods in mitigating domain shift and enhancing model robustness, but also exhibits strong resilience to forgetting. Furthermore, our Buffer layer is modular and can be seamlessly integrated into nearly all existing TTA frameworks, resulting in consistent performance improvements across various architectures. These findings validate the effectiveness and versatility of the proposed solution in real-world domain adaptation scenarios. The code is available at https://github.com/hyeongyu-kim/Buffer_TTA.
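To make the paradigm concrete, here is a minimal sketch of the general idea: freeze the pre-trained backbone, insert a small trainable module after each normalization layer, and adapt only those inserted parameters online (e.g., by entropy minimization, as in common TTA frameworks). The `BufferLayer` design below (a channel-wise affine map initialized to the identity) is an illustrative assumption, not the paper's actual architecture; consult the linked repository for the real implementation.

```python
# Hypothetical buffer-layer TTA sketch (PyTorch). Assumptions, not the
# paper's design: the buffer is a channel-wise affine map, inserted after
# each frozen BatchNorm2d, and adapted by entropy minimization.
import torch
import torch.nn as nn


class BufferLayer(nn.Module):
    """Lightweight channel-wise adjustment; identity at initialization."""

    def __init__(self, num_channels: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(num_channels))
        self.shift = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Broadcast over (N, C, H, W) feature maps.
        return x * self.scale.view(1, -1, 1, 1) + self.shift.view(1, -1, 1, 1)


def attach_buffers(model: nn.Module) -> nn.Module:
    """Freeze the backbone and follow each BatchNorm2d with a BufferLayer.

    The pre-trained weights and BN statistics are left untouched; only the
    newly created buffer parameters remain trainable.
    """
    for p in model.parameters():
        p.requires_grad_(False)  # backbone integrity preserved
    for name, module in model.named_children():
        if isinstance(module, nn.BatchNorm2d):
            module.eval()  # keep pretrained running statistics
            setattr(model, name,
                    nn.Sequential(module, BufferLayer(module.num_features)))
        else:
            attach_buffers(module)
    return model


def tta_step(model: nn.Module, x: torch.Tensor,
             optimizer: torch.optim.Optimizer) -> torch.Tensor:
    """One online adaptation step: minimize prediction entropy w.r.t. the
    buffer parameters only (the frozen backbone receives no updates)."""
    logits = model(x)
    probs = logits.softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return logits
```

A typical usage pattern would wrap a pretrained classifier, collect the trainable buffer parameters, and call `tta_step` on each incoming test batch; since the buffers start as the identity, predictions before any adaptation match the original model exactly.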