🤖 AI Summary
This work addresses the vulnerability of multi-object tracking (MOT) models to appearance, motion, and category distribution shifts during inference. Existing test-time adaptation methods often neglect temporal consistency and identity association across frames and videos. To overcome this limitation, the authors propose the TCEI framework, which, for the first time, introduces a dual-process mechanism modeled on human intuition and experience into test-time adaptation. The intuition system leverages short-term memory for rapid responses to recently observed targets, while the experience system calibrates these predictions using knowledge accumulated from historical test videos. By jointly exploiting high-confidence and uncertain samples, the framework enables online reflection and adaptive adjustment. This two-level calibration strategy balances immediate responsiveness with long-term experience, significantly outperforming existing methods on multiple MOT benchmarks, mitigating performance degradation under distribution shift, and enhancing both temporal consistency and identity stability.
📝 Abstract
Multiple Object Tracking (MOT) has long been a fundamental task in computer vision, with broad applications in various real-world scenarios. However, due to distribution shifts in appearance, motion patterns, and category between the training and testing data, model performance degrades considerably during online inference in MOT. Test-Time Adaptation (TTA) has emerged as a promising paradigm to alleviate such distribution shifts. However, existing TTA methods often fail to deliver satisfactory results in MOT, as they focus solely on frame-level adaptation while neglecting temporal consistency and identity association across frames and videos. Inspired by the human decision-making process, this paper proposes a Test-time Calibration from Experience and Intuition (TCEI) framework. In this framework, the Intuitive system utilizes transient memory to recall recently observed objects for rapid predictions, while the Experiential system leverages the accumulated experience from prior test videos to reassess and calibrate these intuitive predictions. Furthermore, both confident and uncertain objects encountered during online testing are exploited as historical priors and reflective cases, respectively, enabling the model to adapt to the testing environment and alleviate performance degradation. Extensive experiments demonstrate that the proposed TCEI framework consistently achieves superior performance across multiple benchmark datasets and significantly enhances the model's adaptability under distribution shifts. The code will be released at https://github.com/1941Zpf/TCEI.
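To make the two-level idea concrete, here is a minimal, self-contained sketch of how an "intuition" estimate from a short-term memory of recent scores could be blended with an "experience" prior accumulated from confident samples in earlier videos. All class, method, and parameter names (`TwoLevelCalibrator`, `alpha`, `conf_thresh`, etc.) are illustrative assumptions for exposition, not the authors' actual TCEI implementation.

```python
from collections import deque


class TwoLevelCalibrator:
    """Toy two-level calibration: intuition (short-term memory) vs.
    experience (long-term prior). Names and logic are hypothetical,
    not the TCEI authors' code."""

    def __init__(self, short_len=5, alpha=0.7, conf_thresh=0.6):
        self.short = deque(maxlen=short_len)  # intuition: recent raw scores
        self.long_sum = 0.0                   # experience: sum over past confident samples
        self.long_n = 0
        self.alpha = alpha                    # weight on the intuitive estimate
        self.conf_thresh = conf_thresh        # confident samples become historical priors

    def calibrate(self, raw_score):
        # Intuitive estimate: mean of recently observed scores (fast, local)
        intuition = sum(self.short) / len(self.short) if self.short else raw_score
        # Experiential prior: mean over confident samples from prior videos
        experience = self.long_sum / self.long_n if self.long_n else raw_score
        calibrated = self.alpha * intuition + (1 - self.alpha) * experience
        # Update memories: confident samples feed the long-term prior;
        # uncertain ones would be kept as "reflective cases" (omitted here)
        self.short.append(raw_score)
        if raw_score >= self.conf_thresh:
            self.long_sum += raw_score
            self.long_n += 1
        return calibrated
```

In a real tracker the memories would hold appearance embeddings and identity features rather than scalar scores, and the uncertain ("reflective") samples would drive online model updates; the sketch only shows the blending structure.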