Decoupling Ego-Motion from Target Dynamics via Dual-Interval Motion Cues for UAV Detection

📅 2026-05-21
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the degradation of small-object detection in drone-captured video of dynamic scenes, caused primarily by ego-motion, camera jitter, and scale variation. The authors propose a purely vision-based, motion-guided detection framework. It first aligns adjacent frames with homography-based global motion compensation, then extracts motion cues at two intervals (short-term and long-term) to separate object motion from camera-induced disturbances. A lightweight motion-guided attention module uses these cues to enhance feature representations within a feature pyramid network. Built on YOLOv8, the method improves consistently over the baseline on the VisDrone-VID dataset under severe ego-motion. Ablation studies confirm the contribution of the dual-interval design and the attention module.
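To make the dual-interval idea concrete, here is a minimal sketch of short-term and long-term frame differencing. It is an illustration under stated assumptions, not the authors' implementation: the function name `dual_interval_motion`, the interval lengths (1 and 5 frames), and the use of plain absolute differences are all hypothetical, and the frames are assumed to be already aligned by global motion compensation.

```python
import numpy as np

def dual_interval_motion(frames, t, short=1, long=5):
    """Return (short_cue, long_cue) absolute-difference maps for frame t.

    Assumption: `frames` is a sequence of grayscale arrays that have
    already been aligned to frame t, so remaining differences are
    mostly due to object motion rather than camera motion.
    """
    if t - long < 0:
        raise ValueError("not enough history for the long interval")
    cur = frames[t].astype(np.float32)
    # Short interval: responds to fast motion, but is sensitive to jitter.
    short_cue = np.abs(cur - frames[t - short].astype(np.float32))
    # Long interval: lets slow objects accumulate visible displacement.
    long_cue = np.abs(cur - frames[t - long].astype(np.float32))
    return short_cue, long_cue

# Toy clip: a bright pixel moves one column to the right per frame.
frames = [np.zeros((1, 8), dtype=np.float32) for _ in range(6)]
for i, frame in enumerate(frames):
    frame[0, i] = 1.0

short_cue, long_cue = dual_interval_motion(frames, t=5)
```

In the toy clip, the short-interval map responds at the object's current and previous positions, while the long-interval map responds at its current position and where it was five frames earlier. The two maps therefore carry different information about the same target.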
📝 Abstract
Object detection from Unmanned Aerial Vehicles (UAVs) is challenged by severe ego-motion, camera jitter, and large scale variations. While modern detectors perform well on static images, their direct application to UAV video often fails, particularly for small objects in dynamic scenes. Existing motion-based methods either rely on computationally expensive optical flow or use single-interval differencing, which is sensitive to jitter and limited in capturing diverse motion patterns. We propose a vision-only motion-guided detection framework that decouples target motion from camera-induced disturbances. A homography-based Global Motion Compensation (GMC) first aligns adjacent frames. We then introduce a Dual-Interval Motion Extraction strategy that captures both short-term and long-term motion cues. To integrate these cues, a lightweight Motion-Guided Attention (MGA) module enhances feature representations within a Feature Pyramid Network. Experiments on the VisDrone-VID dataset demonstrate consistent improvements over a strong YOLOv8 baseline under severe ego-motion. Ablation studies further confirm the effectiveness of the dual-interval design and the proposed motion-guided attention mechanism.
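The abstract does not specify how the Motion-Guided Attention module combines motion cues with features. One common pattern, shown here purely as an assumed stand-in, is a sigmoid gate with a residual connection: a motion map is squashed to (0, 1) and used to amplify feature responses where motion is present. The function name, the scalar `weight` and `bias` parameters, and the residual form are hypothetical.

```python
import numpy as np

def motion_guided_attention(features, motion, weight=1.0, bias=0.0):
    """Re-weight a feature map with a motion-derived spatial gate.

    features: array of shape (C, H, W), e.g. one pyramid level.
    motion:   array of shape (H, W), already resized to that level.
    Assumption: gate = sigmoid(weight * motion + bias), applied with a
    residual so that static regions keep at least their original response.
    """
    gate = 1.0 / (1.0 + np.exp(-(weight * motion + bias)))
    # Broadcast the (H, W) gate across all C channels.
    return features * (1.0 + gate[None, :, :])

features = np.ones((2, 1, 2), dtype=np.float32)
motion = np.array([[0.0, 100.0]], dtype=np.float32)  # static vs. strongly moving
out = motion_guided_attention(features, motion)
```

With zero motion the gate is 0.5, so features are scaled by 1.5. With strong motion the gate saturates near 1 and features are scaled by about 2. In a trained module the gate parameters would be learned rather than fixed scalars.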
Problem

Research questions and friction points this paper is trying to address.

UAV detection
ego-motion
camera jitter
motion cues
small object detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Interval Motion Extraction
Global Motion Compensation
Motion-Guided Attention
Ego-Motion Decoupling
UAV Object Detection