LER-YOLO: Reliability-Aware Expert Routing for Misaligned RGB-Infrared UAV Detection

📅 2026-05-19
🤖 AI Summary
This work addresses the detection of small drones in misaligned RGB-infrared remote-sensing image pairs, where spatial misregistration, tiny target size, and cluttered backgrounds degrade performance. The authors propose LER-YOLO, a framework whose Uncertainty-Aware Target Alignment module resamples visible features toward the infrared reference and estimates a spatial reliability map. This map guides a sparse Mixture-of-Experts (MoE) fusion module that adaptively selects k experts from RGB-dominant, infrared-dominant, and interactive fusion experts, so that unreliable cross-modal fusion is suppressed. On the MBU benchmark under a YOLOv5s-family protocol, the method achieves 89.7±0.2% AP50 over three independent seeds, with a best result of 89.9%. Ablations and parameter-matched comparisons indicate that the gains come mainly from reliability-guided expert routing rather than from increased model capacity.
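To make the notion of a spatial reliability map concrete, here is a minimal NumPy sketch. It is not the paper's learned alignment module: it assumes, purely for illustration, that reliability can be approximated by the per-pixel cosine similarity between RGB features (already resampled onto the infrared grid) and infrared features. The function name `reliability_map` and the toy tensors are hypothetical.

```python
import numpy as np

def reliability_map(rgb_feat, ir_feat, eps=1e-8):
    """Hypothetical proxy: per-pixel cosine similarity of two (C, H, W)
    feature maps, rescaled from [-1, 1] to [0, 1]."""
    dot = (rgb_feat * ir_feat).sum(axis=0)
    norm = np.linalg.norm(rgb_feat, axis=0) * np.linalg.norm(ir_feat, axis=0)
    cos = dot / (norm + eps)
    return 0.5 * (cos + 1.0)

# Toy data (assumption): the left half of the RGB features agrees with the
# infrared features, the right half is deliberately inverted to mimic a
# region where cross-sensor correspondence has failed.
rng = np.random.default_rng(0)
ir = rng.standard_normal((8, 4, 4))
rgb = ir.copy()
rgb[:, :, 2:] = -ir[:, :, 2:]

rel = reliability_map(rgb, ir)  # shape (4, 4), values in [0, 1]
```

In this toy case the left half of `rel` is close to 1 and the right half close to 0, which is the kind of signal a downstream fusion stage could use to decide where cross-modal interaction is trustworthy.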
📝 Abstract
Detecting small unmanned aerial vehicles from RGB-infrared remote-sensing pairs remains challenging due to tiny target scale, cluttered backgrounds, and spatial misalignment between heterogeneous sensors. Existing bimodal detectors often align or fuse features without assessing the reliability of local cross-sensor correspondence, allowing mismatch artifacts to propagate into the detection head. To address this issue, we propose LER-YOLO, a reliability-aware sparse mixture-of-experts framework for misaligned RGB-infrared UAV detection. LER-YOLO first introduces an Uncertainty-Aware Target Alignment module that resamples visible features toward the infrared reference and estimates a spatial reliability map. This reliability prior is then used by a Reliability-Guided Sparse MoE Fusion module to adaptively select k experts from RGB-dominant, infrared-dominant, and interactive fusion experts, enabling trustworthy cross-modal interaction while suppressing unreliable fusion. Experiments on the public MBU benchmark under a YOLOv5s-family protocol show that LER-YOLO achieves 89.7±0.2% AP50 over three independent seeds, with a best result of 89.9%. Extensive ablations, parameter-matched comparisons, synthetic-shift evaluations, and complexity analysis demonstrate that the gains mainly come from reliability-guided expert routing rather than increased model capacity.
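The routing step described in the abstract (selecting k of three experts per location under a reliability prior) can be illustrated with a small NumPy sketch. Everything specific here is an assumption rather than the authors' design: the names `route_top_k` and `fuse`, the linear penalty on the interactive expert where reliability is low, and the use of a simple average as a stand-in for the interactive expert.

```python
import numpy as np

def route_top_k(gate_logits, reliability, k=2, penalty=4.0):
    """Sparse per-pixel routing over experts [rgb, ir, interactive].

    gate_logits: (3, H, W) raw gate scores; reliability: (H, W) in [0, 1].
    Returns (3, H, W) weights with exactly k non-zeros per pixel, summing to 1.
    """
    logits = np.array(gate_logits, dtype=float)  # writable copy
    # Assumed prior: discourage the interactive expert where reliability is low.
    logits[2] -= penalty * (1.0 - reliability)
    # Keep the k highest-scoring experts at every pixel.
    topk = np.argsort(-logits, axis=0)[:k]
    mask = np.zeros(logits.shape, dtype=bool)
    np.put_along_axis(mask, topk, True, axis=0)
    masked = np.where(mask, logits, -np.inf)
    # Softmax over the selected experts only (dropped experts get weight 0).
    masked = masked - masked.max(axis=0, keepdims=True)
    w = np.exp(masked)
    return w / w.sum(axis=0, keepdims=True)

def fuse(rgb_feat, ir_feat, weights):
    """Weighted sum of expert outputs; the interactive expert is a placeholder
    (plain average), not the paper's interaction module."""
    experts = np.stack([rgb_feat, ir_feat, 0.5 * (rgb_feat + ir_feat)])
    return (weights[:, None] * experts).sum(axis=0)

# Toy example: one reliable pixel and one unreliable pixel.
reliability = np.array([[1.0, 0.0]])
gate_logits = np.broadcast_to(np.array([0.0, 0.5, 1.0])[:, None, None], (3, 1, 2))
weights = route_top_k(gate_logits, reliability, k=2)

rng = np.random.default_rng(0)
rgb_feat = rng.standard_normal((4, 1, 2))
ir_feat = rng.standard_normal((4, 1, 2))
fused = fuse(rgb_feat, ir_feat, weights)
```

With these toy inputs, the reliable pixel routes to the infrared-dominant and interactive experts, while the unreliable pixel drops the interactive expert and falls back to the two single-modality experts.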
Problem

Research questions and friction points this paper is trying to address.

UAV detection
RGB-infrared misalignment
cross-modal fusion
reliability assessment
small object detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

reliability-aware routing
sparse mixture-of-experts
RGB-infrared misalignment
uncertainty-aware alignment
cross-modal fusion
Authors
Liming Hou
Engineering University of PAP, Xi’an 710086, China
Yueping Peng
Engineering University of PAP, Xi’an 710086, China
Hexiang Hao
Engineering University of PAP, Xi’an 710086, China
Ji Wang
Central China Normal University
Wireless Communication
Xuekai Zhang
Engineering University of PAP, Xi’an 710086, China
Wei Tang
Computer Engineering, Kyung Hee University
Thin Client Computing, Mobile Cloud Computing
Zecong Ye
Unit Command Department, Officers College of PAP, Chengdu 610213, China
Xin Ying
Engineering University of PAP, Xi’an 710086, China
Yubo He
Engineering University of PAP, Xi’an 710086, China