Test-Time Visual In-Context Tuning

📅 2025-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing visual in-context learning (VICL) methods generalize poorly under distribution shifts. To address this, the paper proposes test-time Visual In-Context Tuning (VICT), which adapts a VICL model on the fly using only a single test sample. VICT flips the roles of the task prompt and the test sample and minimizes a cycle-consistency loss that reconstructs the original task prompt's output; the key insight is that a model has adapted to a new test distribution when it can successfully recover the original prompt. Extensive experiments across six representative vision tasks, from high-level understanding to low-level image processing, under 15 common corruptions show that VICT improves VICL generalization to unseen domains. The authors also demonstrate its potential for adapting to unseen tasks at test time.

📝 Abstract
Visual in-context learning (VICL), as a new paradigm in computer vision, allows the model to rapidly adapt to various tasks with only a handful of prompts and examples. While effective, the existing VICL paradigm exhibits poor generalizability under distribution shifts. In this work, we propose test-time Visual In-Context Tuning (VICT), a method that can adapt VICL models on the fly with a single test sample. Specifically, we flip the role between the task prompts and the test sample and use a cycle consistency loss to reconstruct the original task prompt output. Our key insight is that a model should be aware of a new test distribution if it can successfully recover the original task prompts. Extensive experiments on six representative vision tasks ranging from high-level visual understanding to low-level image processing, with 15 common corruptions, demonstrate that our VICT can improve the generalizability of VICL to unseen new domains. In addition, we show the potential of applying VICT for unseen tasks at test time. Code: https://github.com/Jiahao000/VICT.
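The role-flipping idea in the abstract can be sketched with a toy example. Below, a hypothetical "in-context model" infers a scalar scaling task from a prompt pair; at test time, the model predicts on the test input, then swaps the test pair and the prompt pair and descends the cycle-consistency reconstruction loss. This is an illustrative assumption-laden sketch (the toy linear model, the tunable weight `w`, and all function names are invented for illustration), not the paper's actual architecture or training loop:

```python
# Toy sketch of VICT-style cycle-consistency test-time tuning.
# The real method adapts a large visual in-context model; here the "model"
# is a scalar map with one tunable weight w (ideal value: 1.0).

def icl_predict(prompt_x, prompt_y, query_x, w):
    """Toy 'in-context' model: infer the task (a scale factor) from the
    prompt pair, then apply it to the query, modulated by weight w."""
    task_scale = prompt_y / prompt_x
    return task_scale * query_x * w

def vict_adapt(prompt_x, prompt_y, test_x, w, lr=0.001, steps=50):
    """Adapt w on a single test sample: predict on the test input, flip the
    roles of prompt and test pairs, and minimize the squared error in
    reconstructing the original prompt output."""
    for _ in range(steps):
        # Forward: predict the test output using the task prompt.
        test_y_pred = icl_predict(prompt_x, prompt_y, test_x, w)
        # Flip: treat (test_x, test_y_pred) as the prompt, reconstruct prompt_y.
        prompt_y_recon = icl_predict(test_x, test_y_pred, prompt_x, w)
        # Cycle-consistency loss err**2; gradient w.r.t. w is analytic here
        # (prompt_y_recon simplifies to prompt_y * w**2 in this toy).
        err = prompt_y_recon - prompt_y
        grad = 2 * err * 2 * w * prompt_y
        w -= lr * grad
    return w

# A "corrupted" model (w = 0.7) recovers correct behavior from one test sample.
w_adapted = vict_adapt(prompt_x=2.0, prompt_y=6.0, test_x=5.0, w=0.7)
print(round(w_adapted, 3))  # converges toward 1.0
```

In the toy, the cycle loss is zero exactly when the flipped pass reproduces the prompt output, mirroring the paper's insight that successful prompt recovery signals adaptation to the test distribution.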
Problem

Research questions and friction points this paper is trying to address.

Enhancing VICL model generalizability under distribution shifts
Adapting VICL models dynamically with single test samples
Improving performance on unseen domains and tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts VICL models with single test sample
Uses cycle consistency loss for prompt reconstruction
Improves generalizability to unseen domains