AI Summary
Existing sliding-window-based multi-frame infrared small target detection methods neglect global temporal dependencies, leading to information loss, computational redundancy, and performance degradation. To address this, we propose a bidirectional temporal information propagation framework that recursively fuses local and global spatio-temporal features; to our knowledge, it is the first of its kind. Specifically, we design a Local Temporal Motion Fusion (LTMF) module to model short-term dynamics and a Global Temporal Motion Fusion (GTMF) module to capture long-range temporal dependencies. Furthermore, we introduce a Spatio-Temporal Fusion (STF) loss to enable end-to-end joint optimization of the entire video clip. Our approach eliminates reliance on fixed sliding windows, significantly improving detection accuracy and robustness for weak and small targets. Extensive experiments on multiple infrared video benchmarks demonstrate state-of-the-art performance while maintaining efficient inference speed.
Abstract
Moving infrared small target detection is broadly adopted in infrared search and track systems and has attracted considerable research attention in recent years. Existing learning-based multi-frame methods mainly aggregate information from adjacent frames in a sliding-window fashion to assist the detection of the current frame. However, sliding-window-based methods do not jointly optimize the entire video clip and ignore the global temporal information outside the sliding window, resulting in redundant computation and sub-optimal performance. In this paper, we propose a Bidirectional temporal information propagation method for moving InfraRed small target Detection, dubbed BIRD. The bidirectional propagation strategy simultaneously exploits local temporal information from adjacent frames and global temporal information from past and future frames in a recursive fashion. Specifically, in the forward and backward propagation branches, we first design a Local Temporal Motion Fusion (LTMF) module to model the local spatio-temporal dependency between a target frame and its two adjacent frames. Then, a Global Temporal Motion Fusion (GTMF) module is developed to further aggregate the globally propagated feature with the local fusion feature. Finally, the bidirectional aggregated features are fused and fed into the detection head. In addition, the entire video clip is jointly optimized by the traditional detection loss and an additional Spatio-Temporal Fusion (STF) loss. Extensive experiments demonstrate that the proposed BIRD method not only achieves state-of-the-art performance but also offers fast inference speed.
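The propagation scheme described above can be sketched in miniature. This is not the paper's implementation: the real LTMF and GTMF modules are learned networks operating on convolutional feature maps, whereas here `ltmf` and `gtmf` are hypothetical stand-ins (simple averaging and a fixed blend) used only to show the data flow of recursive bidirectional propagation over a clip.

```python
# Minimal sketch of BIRD-style bidirectional recursive propagation.
# Frame features are plain float vectors; ltmf/gtmf are placeholder
# fusions standing in for the paper's learned LTMF/GTMF modules.

def ltmf(prev_f, cur_f, next_f):
    # Local Temporal Motion Fusion (placeholder): fuse a target frame
    # with its two adjacent frames by element-wise averaging.
    return [(a + b + c) / 3.0 for a, b, c in zip(prev_f, cur_f, next_f)]

def gtmf(propagated, local):
    # Global Temporal Motion Fusion (placeholder): blend the recursively
    # propagated global state with the local fusion feature.
    return [0.5 * p + 0.5 * l for p, l in zip(propagated, local)]

def propagate(frames, reverse=False):
    # One propagation branch: a global state is carried recursively across
    # the whole clip, so each frame sees information beyond a fixed window.
    order = range(len(frames) - 1, -1, -1) if reverse else range(len(frames))
    state, out = None, [None] * len(frames)
    for t in order:
        prev_f = frames[max(t - 1, 0)]          # clamp at clip boundaries
        next_f = frames[min(t + 1, len(frames) - 1)]
        local = ltmf(prev_f, frames[t], next_f)
        state = local if state is None else gtmf(state, local)
        out[t] = state
    return out

def bird_features(frames):
    # Fuse the forward and backward branch outputs per frame (element-wise
    # mean here); the result would feed the detection head.
    fwd = propagate(frames)
    bwd = propagate(frames, reverse=True)
    return [[(f + b) / 2.0 for f, b in zip(ff, bb)]
            for ff, bb in zip(fwd, bwd)]
```

Note how, unlike a fixed sliding window, the carried `state` lets frame `t` receive information from every earlier frame in the forward branch and every later frame in the backward branch, which is the core argument for bidirectional propagation over windowed aggregation.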