🤖 AI Summary
In multi-task test-time training (TTT), domain shift induces unsynchronized task behavior—i.e., disparate optimal adaptation steps and conflicting gradient directions across tasks—leading to gradient interference and performance degradation. To address this, the authors propose Synchronizing Tasks for Test-time Training (S4T), which introduces a task-relational prediction module that models cross-domain semantic correlations among tasks via self-supervised auxiliary objectives. This module aligns optimization directions across tasks during adaptation, enabling cooperative adaptation without target-domain labels while remaining compatible with mainstream multi-task architectures. Evaluated on standard multi-task benchmarks including PASCAL-Context and NYUDv2, S4T consistently outperforms existing TTT approaches under unknown target domains, with reported average mIoU gains of 2.1–3.8 percentage points. These results suggest that synchronizing tasks improves generalization in test-time adaptation.
📝 Abstract
Generalizing neural networks to unseen target domains is a significant challenge in real-world deployments. Test-time training (TTT) addresses this by using an auxiliary self-supervised task to reduce the domain gap caused by distribution shifts between the source and target domains. However, we find that when models are required to perform multiple tasks under domain shifts, conventional TTT methods suffer from unsynchronized task behavior, where the adaptation steps needed for optimal performance in one task may not align with the requirements of other tasks. To address this, we propose a novel TTT approach called Synchronizing Tasks for Test-time Training (S4T), which enables the concurrent handling of multiple tasks. The core idea behind S4T is that predicting task relations across domain shifts is key to synchronizing tasks during test time. To validate our approach, we apply S4T to conventional multi-task benchmarks, integrating it with traditional TTT protocols. Our empirical results show that S4T outperforms state-of-the-art TTT methods across various benchmarks.
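The "unsynchronized task behavior" the abstract describes can be made concrete with a toy sketch: a single shared parameter is adapted by gradient descent on an auxiliary objective, and two downstream tasks reach their best performance after a different number of adaptation steps. This is an illustrative simplification, not the paper's method; all functions, targets, and learning rates below are invented for the example.

```python
# Toy illustration of unsynchronized task behavior in test-time training:
# adapting a shared parameter with an auxiliary self-supervised-style loss,
# then checking at which adaptation step each task's own loss is lowest.
# (Hypothetical setup; not S4T itself.)

def adapt_steps(theta0=0.0, lr=0.1, steps=20):
    """Gradient descent on an auxiliary loss L_aux(theta) = (theta - 1)^2."""
    thetas = [theta0]
    theta = theta0
    for _ in range(steps):
        grad = 2 * (theta - 1.0)  # d/dtheta of (theta - 1)^2
        theta -= lr * grad
        thetas.append(theta)
    return thetas

def best_step(thetas, target):
    """Index of the adaptation step minimizing the task loss (theta - target)^2."""
    losses = [(t - target) ** 2 for t in thetas]
    return losses.index(min(losses))

thetas = adapt_steps()
step_a = best_step(thetas, target=0.4)  # task A: best after few steps
step_b = best_step(thetas, target=0.9)  # task B: best after many more steps
# step_a < step_b: stopping where task A peaks degrades task B, and vice versa.
```

Because the shared parameter drifts monotonically toward the auxiliary optimum, any single stopping point trades one task off against the other; synchronizing the tasks' adaptation, as S4T aims to do, is what removes that trade-off.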