🤖 AI Summary
State Space Models (SSMs) offer computational efficiency over Vision Transformers (ViTs) but generalize poorly under distribution shifts. To address this, we propose TRUST, the first test-time adaptation (TTA) method designed specifically for SSMs. Leveraging the inherent sequential modeling capability of SSMs, TRUST uses an uncertainty-driven multi-traversal strategy to generate causally diverse views of the input. It combines pseudo-label optimization with cross-traversal parameter averaging, enabling source-free knowledge integration and robust model updates. By explicitly modeling traversal order, TRUST captures cross-scan causal structure in SSM-based TTA. Extensive experiments across seven benchmarks show that TRUST consistently outperforms existing TTA methods, delivering significant gains in out-of-distribution robustness while preserving computational efficiency.
📝 Abstract
State Space Models (SSMs) have emerged as efficient alternatives to Vision Transformers (ViTs), with VMamba standing out as a pioneering architecture designed for vision tasks. However, their generalization performance degrades significantly under distribution shifts. To address this limitation, we propose TRUST (Test-Time Refinement using Uncertainty-Guided SSM Traverses), a novel test-time adaptation (TTA) method that leverages diverse traversal permutations to generate multiple causal perspectives of the input image. Model predictions serve as pseudo-labels to guide updates of the Mamba-specific parameters, and the adapted weights are averaged to integrate the learned information across traversal scans. Altogether, TRUST is the first approach that explicitly leverages the unique architectural properties of SSMs for adaptation. Experiments on seven benchmarks show that TRUST consistently improves robustness and outperforms existing TTA methods.
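The core loop described above (scan the input under several traversal orders, take the model's own prediction as a pseudo-label, update a copy of the parameters per traversal, then average the adapted copies) can be sketched in a toy form. This is an illustrative NumPy sketch, not the paper's implementation: `order_scan`, the linear head `W`, and the single-step update are all simplified stand-ins for the Mamba-specific parameters and optimizer used in TRUST.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def order_scan(patches, decay=0.9):
    """Order-sensitive pooling: a toy recurrent scan (stand-in for an SSM).

    h_t = decay * h_{t-1} + x_t, so different traversal orders of the same
    patches yield different pooled features.
    """
    h = np.zeros(patches.shape[1])
    for x in patches:
        h = decay * h + x
    return h

def trust_style_step(patches, W, traversals, lr=0.1):
    """One hypothetical TRUST-style adaptation step (illustrative only).

    patches:    (N, D) patch features for one test image.
    W:          (D, C) toy linear classification head.
    traversals: list of permutations of range(N), each a scan order.
    Returns the cross-traversal average of the per-traversal adapted weights.
    """
    num_classes = W.shape[1]
    adapted = []
    for order in traversals:
        feat = order_scan(patches[order])          # causal view for this scan order
        probs = softmax(feat @ W)
        pseudo = int(np.argmax(probs))             # prediction as pseudo-label
        onehot = np.eye(num_classes)[pseudo]
        grad = np.outer(feat, probs - onehot)      # cross-entropy gradient w.r.t. W
        adapted.append(W - lr * grad)              # update a copy per traversal
    return np.mean(adapted, axis=0)                # cross-traversal parameter averaging
```

The averaging step is what lets information learned under one scan order regularize the others; a real implementation would restrict updates to Mamba-specific parameters and weight traversals by predictive uncertainty rather than uniformly.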