MSITrack: A Challenging Benchmark for Multispectral Single Object Tracking

📅 2025-10-07
🤖 AI Summary
Existing RGB-based single-object trackers are not robust enough against occlusion, distractor interference, and complex backgrounds, while multispectral tracking lacks large-scale, high-quality benchmark datasets. To address these limitations, we introduce MSITrack, the largest and most diverse multispectral single-object tracking dataset to date, comprising 300 videos, over 129k frames, 55 object categories, and 300 natural scenes. MSITrack emphasizes challenging attributes, including interference from similar-looking distractors and strong target-background similarity in texture and color. Data are acquired with multispectral imaging (visible and infrared), then manually annotated and checked through multi-stage verification to ensure annotation quality. Evaluations with representative state-of-the-art trackers show that using the multispectral data in MSITrack significantly improves performance over RGB-only baselines, indicating the dataset's value for advancing robust visual tracking.

📝 Abstract
Visual object tracking in real-world scenarios presents numerous challenges, including occlusion, interference from similar objects, and complex backgrounds, all of which limit the effectiveness of RGB-based trackers. Multispectral imagery, which captures pixel-level spectral reflectance, enhances target discriminability. However, the availability of multispectral tracking datasets remains limited. To bridge this gap, we introduce MSITrack, the largest and most diverse multispectral single object tracking dataset to date. MSITrack offers the following key features: (i) More Challenging Attributes: interference from similar objects and similarity in color and texture between targets and backgrounds in natural scenarios, along with a wide range of real-world tracking challenges; (ii) Richer and More Natural Scenes: spanning 55 object categories and 300 distinct natural scenes, MSITrack far exceeds the scope of existing benchmarks, and many of these scenes and categories are introduced to the multispectral tracking domain for the first time; (iii) Larger Scale: 300 videos comprising over 129k frames of multispectral imagery. To ensure annotation precision, each frame has undergone meticulous processing, manual labeling, and multi-stage verification. Extensive evaluations using representative trackers demonstrate that the multispectral data in MSITrack significantly improves performance over RGB-only baselines, highlighting its potential to drive future advancements in the field. The MSITrack dataset is publicly available at: https://github.com/Fengtao191/MSITrack.
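The claim that pixel-level spectral reflectance enhances target discriminability can be illustrated with a small sketch. It is not taken from the paper: the six-band layout, the reflectance values, and the use of the spectral angle as a similarity measure are all assumptions chosen for illustration. The idea is that a target and a background pixel can look identical in three visible bands yet differ clearly once additional bands are considered.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectral vectors (near 0 = same spectral shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos)

# Hypothetical reflectance values, invented for illustration only.
# The first three entries stand in for visible bands, the rest for extra bands.
target = [0.30, 0.42, 0.25, 0.60, 0.65, 0.70]
background = [0.30, 0.42, 0.25, 0.20, 0.18, 0.15]

rgb_angle = spectral_angle(target[:3], background[:3])   # visible bands only
full_angle = spectral_angle(target, background)          # all six bands

print(f"visible-only angle: {rgb_angle:.4f} rad")
print(f"all-band angle:     {full_angle:.4f} rad")
```

With these made-up values the visible-only angle is essentially zero, while the all-band angle is clearly nonzero, which is the kind of extra separation between target and background that multispectral data can offer a tracker.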
Problem

Research questions and friction points this paper is trying to address.

Addresses limitations of RGB trackers with occlusion and interference
Introduces largest multispectral dataset for enhanced object tracking
Improves tracking performance using pixel-level spectral reflectance data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Largest multispectral tracking dataset with 300 videos
Manual annotation with multi-stage verification process
Multispectral data improves performance over RGB baselines
Tao Feng
Beijing Institute of Technology
Tingfa Xu
Beijing Institute of Technology
Haolin Qin
Beijing Institute of Technology
Tianhao Li
Beijing Institute of Technology
Shuaihao Han
Beijing Institute of Technology
Xuyang Zou
Beijing Institute of Technology
Zhan Lv
Beijing Institute of Technology
Jianan Li
Beijing Institute of Technology