When Small Guides Large: Cross-Model Co-Learning for Test-Time Adaptation

📅 2025-06-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of single-model test-time adaptation (TTA) by systematically investigating cross-scale knowledge collaboration for the first time. We propose COCA, a framework that enables online guidance from a lightweight model to a larger one, integrating co-adaptation (via cross-model feature alignment and momentum-based contrastive learning) with self-adaptation (via gradient-coupled personalized optimization). COCA supports heterogeneous architectures, including ResNet, ViT, and MobileViT, out of the box, requiring neither additional annotations nor offline training. On ImageNet-C, ViT-Base reaches an average accuracy of 64.5%, up from 51.7%, substantially surpassing existing state-of-the-art methods. The core contributions are: (i) revealing the efficacy of small models in guiding large models during TTA; and (ii) establishing the first scalable, architecture-agnostic multi-model collaborative TTA paradigm.
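As a rough illustration of the co-adaptation idea, the sketch below fuses two models' predictions into a shared pseudo-label, weighting each model by its per-sample confidence (here taken as negative predictive entropy). This is a minimal NumPy sketch under my own assumptions: the confidence weighting and all function names are illustrative, not the paper's actual formulation, which relies on cross-model feature alignment and momentum-based contrastive learning.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def co_adaptation_targets(logits_small, logits_large):
    """Illustrative sketch (not the paper's exact method): fuse the two
    models' predictions into one pseudo-label, weighting each model by
    its per-sample confidence (exp of negative entropy)."""
    p_s, p_l = softmax(logits_small), softmax(logits_large)
    c_s = np.exp(-entropy(p_s))   # lower entropy -> larger weight
    c_l = np.exp(-entropy(p_l))
    w_s = c_s / (c_s + c_l)
    return w_s[:, None] * p_s + (1 - w_s)[:, None] * p_l

# Toy example: the small model is confident, the large one is not,
# so the fused target leans toward the small model's prediction.
small = np.array([[4.0, 0.0, 0.0]])   # confident on class 0
large = np.array([[0.2, 0.1, 0.0]])   # nearly uniform
t = co_adaptation_targets(small, large)
print(t.argmax(axis=-1))  # class 0 dominates the fused target
```

Each model would then be trained against this fused target, which is how a confident small model can pull a larger, less certain one toward better predictions.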

📝 Abstract
Test-time adaptation (TTA) adapts a given model to test-domain data with potential domain shifts through online unsupervised learning, yielding impressive performance. To date, however, existing TTA methods have focused primarily on single-model adaptation. In this work, we investigate an intriguing question: how does cross-model knowledge influence the TTA process? Our findings reveal that, in TTA's unsupervised online setting, each model can provide complementary, confident knowledge to the others, even when there are substantial differences in model size. For instance, a smaller model such as MobileViT (10.6M parameters) can effectively guide a larger model such as ViT-Base (86.6M parameters). In light of this, we propose COCA, a Cross-Model Co-Learning framework for TTA, which consists of two main strategies. 1) Co-adaptation adaptively integrates complementary knowledge from other models throughout the TTA process, reducing individual model biases. 2) Self-adaptation enhances each model's unique strengths via unsupervised learning, enabling diverse adaptation to the target domain. Extensive experiments show that COCA, which can also serve as a plug-and-play module, significantly boosts existing state-of-the-art methods on models of various sizes (including ResNets, ViTs, and MobileViTs) via cross-model co-learned TTA. For example, with MobileViT's guidance, COCA raises ViT-Base's average adaptation accuracy on ImageNet-C from 51.7% to 64.5%. The code is publicly available at https://github.com/ycarobot/COCA.
Problem

Research questions and friction points this paper is trying to address.

Explores cross-model knowledge impact on Test-time Adaptation (TTA)
Proposes COCA framework for complementary model co-learning
Enhances adaptation accuracy across diverse model sizes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-model co-learning for test-time adaptation
Co-adaptation integrates complementary model knowledge
Self-adaptation enhances unique model strengths
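The self-adaptation strategy rests on unsupervised objectives of the kind used in prior TTA work, such as prediction-entropy minimization. The toy NumPy sketch below is my own assumption-laden illustration: it applies a TENT-style entropy-minimization step directly to a logit vector to show the entropy decreasing, whereas the real method updates model parameters through gradient-coupled optimization.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_adaptation_step(logits, lr=0.5):
    """Toy entropy-minimization step on the logits themselves
    (illustrative only; the actual method updates model weights)."""
    p = softmax(logits)
    log_p = np.log(p + 1e-12)
    H = -(p * log_p).sum(axis=-1, keepdims=True)
    grad = -p * (log_p + H)          # dH/dlogits for p = softmax(logits)
    return logits - lr * grad        # gradient descent on entropy

ent = lambda z: float(-(softmax(z) * np.log(softmax(z))).sum())

z0 = np.array([[0.2, 0.1, 0.0]])     # nearly uniform prediction
z1 = self_adaptation_step(z0)
print(ent(z1) < ent(z0))  # entropy drops after the update
```

Sharpening each model's own predictions this way keeps the co-learners diverse, which is what makes their fused pseudo-labels informative rather than redundant.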