FwNet-ECA: Facilitating Window Attention with Global Receptive Fields through Fourier Filtering Operations

📅 2025-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limitations of windowed attention (restricted receptive fields and fragmented information flow between windows), this paper proposes FwNet-ECA. The method brings the Fourier transform into windowed attention design: a learnable spectral weight matrix implicitly models long-range dependencies in the frequency domain, which removes the need for explicit window shifting while extending the receptive field across the whole feature map. FwNet-ECA also integrates the lightweight Efficient Channel Attention (ECA) module, so that frequency-domain filtering and channel attention work together. Evaluated on iCartoonFace and ImageNet, the model is reported to reduce parameter count and FLOPs by over 30% compared to Swin Transformer while maintaining competitive accuracy.

📝 Abstract
Windowed attention mechanisms were introduced to mitigate the issue of excessive computation inherent in global attention mechanisms. In this paper, we present FwNet-ECA, a novel method that utilizes Fourier transforms paired with learnable weight matrices to enhance the spectral features of images. This strategy facilitates inter-window connectivity, thereby maximizing the receptive field. Additionally, we incorporate the Efficient Channel Attention (ECA) module to improve communication between different channels. Instead of relying on physically shifted windows, our approach leverages frequency domain enhancement to implicitly bridge information across spatial regions. We validate our model on the iCartoonFace dataset and conduct downstream tasks on ImageNet, demonstrating that our model achieves lower parameter counts and computational overheads compared to shifted window approaches, while maintaining competitive accuracy. This work offers a more efficient and effective alternative for leveraging attention mechanisms in visual processing tasks, alleviating the challenges associated with windowed attention models. Code is available at https://github.com/qingxiaoli/FwNet-ECA.
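
The frequency-domain step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (the repository linked above is the reference for that). The function name `fourier_filter`, the (H, W, C) tensor layout, the weight shape, and the random stand-in for trained weights are all assumptions made for illustration.

```python
import numpy as np


def fourier_filter(x, weight):
    """Filter a real feature map in the frequency domain (illustrative only).

    x:      real array of shape (H, W, C)
    weight: complex array of shape (H, W // 2 + 1, C), standing in for a
            learnable spectral weight matrix (hypothetical layout)
    """
    h, w, _ = x.shape
    # 2-D real FFT over the two spatial axes.
    spectrum = np.fft.rfft2(x, axes=(0, 1), norm="ortho")
    # Element-wise multiplication: one complex weight per frequency and channel.
    spectrum = spectrum * weight
    # Back to the spatial domain, restoring the original height and width.
    return np.fft.irfft2(spectrum, s=(h, w), axes=(0, 1), norm="ortho")


rng = np.random.default_rng(0)
H, W, C = 8, 8, 4
x = rng.standard_normal((H, W, C))

# With all-ones weights the filter is the identity.
identity = np.ones((H, W // 2 + 1, C), dtype=complex)
y_identity = fourier_filter(x, identity)

# With non-trivial weights, a single active pixel influences every position,
# including positions far outside any small local window around it.
weight = rng.standard_normal((H, W // 2 + 1, C)) + 1j * rng.standard_normal(
    (H, W // 2 + 1, C)
)
impulse = np.zeros((H, W, C))
impulse[0, 0, :] = 1.0
y_impulse = fourier_filter(impulse, weight)
```

Because each frequency coefficient depends on every spatial position, multiplying the spectrum by a weight mixes information across the whole map in one step. That is the sense in which a frequency-domain filter can connect windows without shifting them.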
Problem

Research questions and friction points this paper is trying to address.

Enhance spectral features using Fourier transforms
Improve inter-window connectivity and receptive fields
Reduce computational overhead in attention mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fourier transforms enhance spectral features
Efficient Channel Attention improves channel communication
Frequency domain bridges spatial region information
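
For readers unfamiliar with Efficient Channel Attention, the following is a minimal NumPy sketch of the general idea: global average pooling, a small 1-D convolution across neighbouring channels, and a sigmoid gate. It is a sketch under stated assumptions, not the module used in FwNet-ECA. The function name `eca`, the fixed kernel size of 3, and the (H, W, C) layout are illustrative choices.

```python
import numpy as np


def eca(x, kernel):
    """Channel attention in the style of ECA (illustrative only).

    x:      real array of shape (H, W, C)
    kernel: 1-D array of odd length k <= C, standing in for the learnable
            weights of a 1-D convolution across channels (hypothetical)
    Returns the re-weighted feature map and the per-channel gate.
    """
    # Squeeze: one descriptor per channel via global average pooling.
    pooled = x.mean(axis=(0, 1))                      # shape (C,)
    # Local cross-channel interaction: each channel sees its k neighbours.
    mixed = np.correlate(pooled, kernel, mode="same")  # shape (C,)
    # Sigmoid gate in (0, 1), one value per channel.
    gate = 1.0 / (1.0 + np.exp(-mixed))
    # Excite: scale every channel by its gate (broadcast over H and W).
    return x * gate, gate


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))
kernel = rng.standard_normal(3)  # assumed kernel size of 3
y, gate = eca(x, kernel)
```

The only parameters in this sketch are the `k` kernel weights, which is why this style of channel attention is considered lightweight compared with gating built from fully connected layers.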
Shengtian Mian
School of Statistics, Capital University of Economics and Business, Beijing, 100070, China.
Ya Wang
School of Mathematical Sciences, Peking University, Beijing, 100871, China.
Nannan Gu
School of Statistics, Capital University of Economics and Business, Beijing, 100070, China.
Yuping Wang
School of Statistics, Capital University of Economics and Business, Beijing, 100070, China.
Xiaoqing Li
National University of Singapore
Biomedical Engineering