FMC-DETR: Frequency-Decoupled Multi-Domain Coordination for Aerial-View Object Detection

📅 2025-09-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address performance bottlenecks in detecting tiny objects in high-resolution aerial imagery, which stem from weak global contextual awareness in shallow features and the loss of multi-scale details, this paper proposes FMC-DETR, a frequency-decoupled, multi-domain coordination framework for detection. Methodologically, it introduces (1) the Wavelet Kolmogorov-Arnold Transformer (WeKat), a backbone that combines cascaded wavelet decomposition with Kolmogorov-Arnold networks for non-linear modeling of multi-scale dependencies; (2) a lightweight Cross-stage Partial Fusion (CPF) module; and (3) a Multi-Domain Feature Coordination (MDFC) module that unifies spatial, frequency, and structural priors to balance low-frequency global enhancement against high-frequency detail preservation. The method achieves state-of-the-art performance with fewer parameters than competing approaches; on the VisDrone dataset it improves on the baseline by 6.5% AP and 8.2% AP₅₀.

📝 Abstract
Aerial-view object detection is a critical technology for real-world applications such as natural resource monitoring, traffic management, and UAV-based search and rescue. Detecting tiny objects in high-resolution aerial imagery presents a long-standing challenge due to their limited visual cues and the difficulty of modeling global context in complex scenes. Existing methods are often hampered by delayed contextual fusion and inadequate non-linear modeling, failing to effectively use global information to refine shallow features and thus encountering a performance bottleneck. To address these challenges, we propose FMC-DETR, a novel framework with frequency-decoupled fusion for aerial-view object detection. First, we introduce the Wavelet Kolmogorov-Arnold Transformer (WeKat) backbone, which applies cascaded wavelet transforms to enhance global low-frequency context perception in shallow features while preserving fine-grained details, and employs Kolmogorov-Arnold networks to achieve adaptive non-linear modeling of multi-scale dependencies. Next, a lightweight Cross-stage Partial Fusion (CPF) module reduces redundancy and improves multi-scale feature interaction. Finally, we introduce the Multi-Domain Feature Coordination (MDFC) module, which unifies spatial, frequency, and structural priors to balance detail preservation and global enhancement. Extensive experiments on benchmark aerial-view datasets demonstrate that FMC-DETR achieves state-of-the-art performance with fewer parameters. On the challenging VisDrone dataset, our model achieves improvements of 6.5% AP and 8.2% AP50 over the baseline, highlighting its effectiveness in tiny object detection. The code can be accessed at https://github.com/bloomingvision/FMC-DETR.
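The idea of separating a feature map into a low-frequency band and high-frequency detail bands, then repeating the split on the low-frequency band, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which wavelet WeKat uses, so an orthonormal Haar transform is assumed here, the function names `haar_dwt2` and `cascaded_haar` are hypothetical, and the code works on plain nested lists rather than network tensors.

```python
def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform (assumed wavelet).

    x is a 2D list with even height and width.
    Returns (ll, lh, hl, hh), each half the size of x in both directions.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            row_ll.append((a + b + c + d) / 2)  # low-pass in both directions
            row_lh.append((a + b - c - d) / 2)  # difference between the two rows
            row_hl.append((a - b + c - d) / 2)  # difference between the two columns
            row_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def cascaded_haar(x, levels):
    """Re-apply haar_dwt2 to the low-frequency band `levels` times.

    Returns the final low-frequency band and the detail bands of every level,
    so no fine-grained information is discarded.
    """
    details = []
    low = x
    for _ in range(levels):
        low, lh, hl, hh = haar_dwt2(low)
        details.append((lh, hl, hh))
    return low, details


# Toy 4x4 "feature map" holding the values 0..15 in row-major order.
feature = [[float(4 * r + c) for c in range(4)] for r in range(4)]
low, details = cascaded_haar(feature, levels=2)
print(low)  # [[30.0]]
```

After two levels the low-frequency band has shrunk to a single value that summarizes the whole map, while the detail bands kept at each level retain the local differences. This is the general sense in which a cascaded wavelet transform can widen context in shallow features without dropping fine detail.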
Problem

Research questions and friction points this paper is trying to address.

Detecting tiny objects in high-resolution aerial imagery
Addressing delayed contextual fusion in object detection
Improving global information usage for shallow feature refinement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wavelet Kolmogorov-Arnold Transformer enhances global context perception
Cross-stage Partial Fusion module improves multi-scale feature interaction
Multi-Domain Feature Coordination unifies spatial, frequency, and structural priors
Authors

Ben Liang
Department of Electrical and Computer Engineering, University of Toronto
Networked Systems, Wireless Communications, Mobile Computing, Mobility Management
Yuan Liu
School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210094, China
Bingwen Qiu
School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210094, China
Yihong Wang
School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210094, China
Xiubao Sui
Nanjing University of Science and Technology
infrared colorization
Qian Chen
School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210094, China