🤖 AI Summary
This work addresses the challenge of enabling high-speed drones to perceive both static and dynamic obstacles in highly dynamic, complex environments. The authors propose an end-to-end flight control network that fuses event-camera and depth-camera data at the feature level through a bidirectional cross-attention mechanism, trained via imitation learning. They also introduce a novel Spherical Principal Search (SPS) expert trajectory planner with linear computational complexity O(n), which significantly improves trajectory smoothness and obstacle-avoidance performance. Experimental results show that the system achieves a 70–80% success rate in obstacle avoidance at speeds up to 17 m/s, a 10–20% improvement over unimodal and unidirectionally fused approaches, while the SPS planner alone exceeds an 80% success rate.
📝 Abstract
Achieving safe, high-speed autonomous flight in complex environments with static, dynamic, or mixed obstacles remains challenging, as a single perception modality is incomplete. Depth cameras are effective for static objects but suffer from motion blur at high speeds. Conversely, event cameras excel at capturing rapid motion but struggle to perceive static scenes. To exploit the complementary strengths of both sensors, we propose an end-to-end flight control network that achieves feature-level fusion of depth images and event data through a bidirectional cross-attention module. The end-to-end network is trained via imitation learning, which relies on high-quality supervision. Building on this insight, we design an efficient expert planner using Spherical Principal Search (SPS). This planner reduces computational complexity from $O(n^2)$ to $O(n)$ while generating smoother trajectories, achieving a success rate above 80% at 17 m/s, nearly 20% higher than traditional planners. Simulation experiments show that our method attains a 70–80% success rate at 17 m/s across varied scenes, surpassing single-modality and unidirectional fusion models by 10–20%. These results demonstrate that bidirectional fusion effectively integrates event and depth information, enabling more reliable obstacle avoidance in complex environments with both static and dynamic objects.
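To make the fusion idea concrete, here is a minimal sketch of bidirectional cross-attention between two token sets, one per modality. This is an illustrative single-head, projection-free version in NumPy, not the authors' network: the token counts, feature dimension, residual connections, and concatenation-based fusion are all assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_model):
    # scaled dot-product cross-attention; learned W_q/W_k/W_v
    # projections are omitted for brevity (assumption)
    scores = queries @ keys_values.T / np.sqrt(d_model)
    return softmax(scores, axis=-1) @ keys_values

def bidirectional_fuse(depth_tokens, event_tokens):
    # each modality queries the other, then both enriched
    # streams are concatenated into one fused representation
    d = depth_tokens.shape[-1]
    depth_enriched = depth_tokens + cross_attention(depth_tokens, event_tokens, d)
    event_enriched = event_tokens + cross_attention(event_tokens, depth_tokens, d)
    return np.concatenate([depth_enriched, event_enriched], axis=0)

rng = np.random.default_rng(0)
depth_tokens = rng.standard_normal((16, 64))  # hypothetical depth feature tokens
event_tokens = rng.standard_normal((16, 64))  # hypothetical event feature tokens
fused = bidirectional_fuse(depth_tokens, event_tokens)
print(fused.shape)  # (32, 64)
```

The key property shown is the symmetry: depth features attend to event features and vice versa, so motion cues from the event stream can sharpen static depth features and depth geometry can ground the sparse event stream, before a downstream control head consumes the fused tokens.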