🤖 AI Summary
This work addresses the challenge in infrared small target detection where targets are difficult to distinguish from high-frequency background clutter, such as bright corners and fragmented clouds, due to the lack of explicit high-frequency modeling in existing deep learning approaches. To this end, we propose Dynamic High-Frequency Convolution (DHiF), which leverages frequency-domain characteristics to generate dynamic local filter banks that adaptively capture grayscale variations in high-frequency regions, thereby effectively differentiating targets from clutter. DHiF reformulates discriminative high-frequency modeling into a zero-centered symmetric dynamic filtering mechanism that seamlessly integrates into standard convolutional networks. Extensive experiments on multiple real-world SIRST datasets demonstrate that our method consistently outperforms state-of-the-art approaches across various backbone architectures while maintaining manageable computational overhead.
📝 Abstract
Infrared small targets are typically tiny and locally salient, and therefore belong to the high-frequency components (HFCs) of an image. Single-frame infrared small target (SIRST) detection is challenging because many other HFCs appear alongside targets, such as bright corners, broken clouds, and other clutter. Current learning-based methods rely on the powerful capabilities of deep networks but neglect explicit modeling and discriminative representation learning of the various HFCs, which is important for distinguishing targets from other HFCs. To address these issues, we propose a dynamic high-frequency convolution (DHiF) that translates the discriminative modeling process into the generation of a dynamic local filter bank. Specifically, DHiF is sensitive to HFCs because the dynamic parameters of its generated filters are symmetrically adjusted within a zero-centered range according to properties of the Fourier transform. Combined with standard convolution operations, DHiF can adaptively process different HFC regions and capture their distinctive grayscale variation characteristics for discriminative representation learning. DHiF functions as a drop-in replacement for standard convolution and can be used in arbitrary SIRST detection networks without a significant decrease in computational efficiency. To validate the effectiveness of DHiF, we conducted extensive experiments across different SIRST detection networks on real-scene datasets. Compared with other state-of-the-art convolution operations, DHiF achieves superior detection performance. Code is available at https://github.com/TinaLRJ/DHiF.
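
The abstract does not spell out how the filters are generated, so the following is only a rough illustration of the general idea and not the authors' implementation. It relies on one standard Fourier property: the zero-frequency (DC) gain of a filter equals the sum of its weights, so a kernel whose weights sum to zero suppresses flat background and responds only to local grayscale variation. Everything here is an assumption made for illustration: the function names (`zero_centered_kernel`, `extract_patch`, `dynamic_high_pass`), the `tanh` bounding, the mean subtraction, and the hand-written stand-in for what would be a learned filter generator in a real network.

```python
import math


def zero_centered_kernel(raw):
    """Map raw parameters to a kernel whose weights sum to zero.

    Assumption for illustration: tanh bounds each weight, and subtracting
    the mean centers the kernel on zero, which removes its DC response.
    """
    bounded = [math.tanh(r) for r in raw]
    mean = sum(bounded) / len(bounded)
    return [b - mean for b in bounded]


def extract_patch(image, row, col):
    """Return the 3x3 neighbourhood of (row, col) as a flat list.

    Border pixels are handled by replicating the nearest edge value.
    """
    height, width = len(image), len(image[0])
    patch = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r = min(max(row + dr, 0), height - 1)
            c = min(max(col + dc, 0), width - 1)
            patch.append(image[r][c])
    return patch


def dynamic_high_pass(image):
    """Apply a different zero-sum 3x3 kernel at every pixel.

    The raw parameters are the patch deviations from the local mean, a
    hypothetical stand-in for a learned, input-dependent generator.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            patch = extract_patch(image, r, c)
            local_mean = sum(patch) / len(patch)
            raw = [p - local_mean for p in patch]
            kernel = zero_centered_kernel(raw)
            out[r][c] = sum(k * p for k, p in zip(kernel, patch))
    return out


# Toy input: a single bright pixel on a flat background.
toy_image = [[0.0] * 5 for _ in range(5)]
toy_image[2][2] = 1.0
toy_response = dynamic_high_pass(toy_image)
```

In this toy setting the response is zero wherever the neighbourhood is flat and positive at the bright pixel. A practical layer would instead predict the raw parameters with a small learned branch and combine the result with a standard convolution, as the abstract describes.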