Matching High-Dimensional Geometric Quantiles for Test-Time Adaptation of Transformers and Convolutional Networks Alike

📅 2026-01-16
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the degradation of model generalization under test-time distribution shifts, a challenge often tackled by methods that rely on specific network architectures and thus lack broad applicability. To overcome this limitation, the authors propose an architecture-agnostic test-time adaptation approach that employs a lightweight adapter network to preprocess inputs and, for the first time, introduces high-dimensional geometric quantile matching to correct distributional discrepancies. The method is supported by a quantile-based loss function and a theoretical analysis framework that provides convergence guarantees. Extensive experiments on CIFAR-10/100-C and TinyImageNet-C demonstrate consistent and significant robustness improvements across both convolutional and Transformer-based architectures, confirming the method's generality and effectiveness.

📝 Abstract
Test-time adaptation (TTA) refers to adapting a classifier to test data whose probability distribution differs slightly from that of the model's training data. To the best of our knowledge, most existing TTA approaches modify the classifier's weights in ways that rely heavily on its architecture, and it is unclear how these approaches extend to generic architectures. In this article, we propose an architecture-agnostic approach to TTA that adds an adapter network to pre-process the input images into a form suitable for the classifier. This adapter is trained using the proposed quantile loss. Unlike existing approaches, we correct for the distribution shift by matching high-dimensional geometric quantiles. We prove theoretically that, under suitable conditions, minimizing the quantile loss can learn the optimal adapter. We validate our approach on CIFAR10-C, CIFAR100-C and TinyImageNet-C using both classic convolutional and transformer networks trained on the CIFAR10, CIFAR100 and TinyImageNet datasets.
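
To make the idea of matching geometric quantiles concrete, here is a minimal NumPy sketch. It is not the paper's loss, which the abstract does not spell out. It assumes the standard definition of a geometric quantile, Q(u) = argmin_q E[‖X − q‖ + ⟨u, X − q⟩] for a vector u with ‖u‖ < 1 (u = 0 gives the geometric median), and approximates it with a Weiszfeld-style fixed-point iteration. The function names `geometric_quantile` and `quantile_matching_loss`, the choice of squared distance, and the iteration count are all illustrative assumptions. The adapter network that such a loss would train is not shown.

```python
import numpy as np


def geometric_quantile(X, u, n_iter=200, eps=1e-12):
    """Approximate the geometric quantile Q(u) of the rows of X, for ||u|| < 1.

    Minimizes mean(||x_i - q|| + <u, x_i - q>) over q with a Weiszfeld-style
    fixed-point update. Illustrative only; no convergence check is performed.
    """
    X = np.asarray(X, dtype=float)
    u = np.asarray(u, dtype=float)
    if np.linalg.norm(u) >= 1.0:
        raise ValueError("u must lie strictly inside the unit ball")
    n = X.shape[0]
    q = X.mean(axis=0)  # start from the sample mean
    for _ in range(n_iter):
        # Distances to the current iterate, clamped to avoid division by zero.
        d = np.maximum(np.linalg.norm(X - q, axis=1), eps)
        w = 1.0 / d
        # Stationarity: sum_i (x_i - q) / d_i + n * u = 0, solved for q.
        q = (w @ X + n * u) / w.sum()
    return q


def quantile_matching_loss(X_src, X_tgt, directions):
    """Mean squared distance between source and target geometric quantiles.

    `directions` is an iterable of vectors u with ||u|| < 1 (an assumed
    design choice; the source does not say how quantile indices are picked).
    """
    gaps = [
        np.sum((geometric_quantile(X_src, u) - geometric_quantile(X_tgt, u)) ** 2)
        for u in directions
    ]
    return float(np.mean(gaps))
```

Because geometric quantiles are translation equivariant, shifting one sample by a constant vector s makes this particular loss approximately ‖s‖², and two identical samples give a loss of zero. In a TTA setting, `X_src` would stand for statistics of the training data and `X_tgt` for adapter outputs on test data, but that mapping is an assumption of this sketch.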
Problem

Research questions and friction points this paper is trying to address.

test-time adaptation
distribution shift
architecture-agnostic
geometric quantiles
high-dimensional data
Innovation

Methods, ideas, or system contributions that make the work stand out.

test-time adaptation
geometric quantiles
architecture-agnostic
quantile loss
distribution shift