🤖 AI Summary
To address high latency and fine-grained perception challenges in autonomous UAV obstacle avoidance within complex environments, this paper proposes an end-to-end reinforcement learning control framework driven by onboard 3D LiDAR. Methodologically, the authors design a lightweight point-cloud surrogate representation that balances fine-grained perception of narrow passages and thin obstacles (e.g., wires) with policy training efficiency; employ a perception encoder fusing voxelization and projection-based encoding; and train the policy using the Proximal Policy Optimization (PPO) algorithm in a lightweight physics simulator, followed by sim-to-real transfer. Contributions include: (i) the first pure-LiDAR-driven end-to-end low-level control paradigm, directly outputting motor-level control commands at 50 Hz; (ii) a success rate above 92% in simulation for complex maneuvers under multiple speed constraints; and (iii) real-world validation demonstrating fully autonomous indoor flight with reliable avoidance of thin wires and randomly distributed obstacles.
📝 Abstract
A long-cherished vision of drones is to autonomously traverse clutter to reach every corner of the world using onboard sensing and computation. In this paper, we combine onboard 3D lidar sensing and sim-to-real reinforcement learning (RL) to enable autonomous flight in cluttered environments. Compared to vision sensors, lidars appear to be more straightforward and accurate for geometric modeling of the surroundings, which is one of the most important cues for successful obstacle avoidance. The sim-to-real RL approach, in turn, facilitates low-latency control without the hierarchy of trajectory generation and tracking. We demonstrate that, with design choices of practical significance, we can effectively combine the advantages of 3D lidar sensing and RL to control a quadrotor through a low-level control interface at 50 Hz. The key to learning the policy in a lightweight way lies in a specialized surrogate of the lidar's raw point clouds, which simplifies learning while retaining perception fine-grained enough to detect narrow free space and thin obstacles. Simulation statistics demonstrate the advantages of the proposed system over alternatives, including easier maneuvers and higher success rates under different speed constraints. With lightweight simulation techniques, the policy trained in the simulator can control a physical quadrotor, and the resulting system can dodge thin obstacles and safely traverse randomly distributed obstacles.
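The abstract does not specify how the point-cloud surrogate is built. As a rough illustration only of what a compact, fixed-size stand-in for a raw point cloud can look like, the sketch below bins points by azimuth and elevation and keeps the minimum range in each bin, so a single return from a thin wire still marks its cell as occupied. This is an assumed, generic min-range projection, not the authors' representation; the function name `min_range_image`, the grid size, the field of view, and the range cap are all hypothetical.

```python
import math


def min_range_image(points, n_az=36, n_el=8,
                    el_min=-math.pi / 6, el_max=math.pi / 6,
                    max_range=10.0):
    """Project (x, y, z) points in the sensor frame onto an n_el x n_az grid.

    Each cell stores the smallest range seen in that direction; cells with
    no return keep max_range. All parameters are illustrative assumptions.
    """
    image = [[max_range] * n_az for _ in range(n_el)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0 or r >= max_range:
            continue  # skip degenerate or out-of-range returns
        az = math.atan2(y, x)  # azimuth in [-pi, pi]
        # Clamp guards against tiny floating-point overshoot before asin.
        el = math.asin(max(-1.0, min(1.0, z / r)))
        if not (el_min <= el <= el_max):
            continue  # outside the assumed vertical field of view
        col = min(int((az + math.pi) / (2 * math.pi) * n_az), n_az - 1)
        row = min(int((el - el_min) / (el_max - el_min) * n_el), n_el - 1)
        # Min-pooling: the nearest return wins, so thin obstacles survive.
        if r < image[row][col]:
            image[row][col] = r
    return image


# Two returns along the same ray: the nearer one determines the cell value.
image = min_range_image([(2.0, 0.0, 0.0), (5.0, 0.0, 0.0)])
```

Taking the minimum rather than the mean per cell is the design choice that matters here: averaging would wash out a lone wire return against the background, while min-pooling preserves it at the cost of being more sensitive to spurious near returns.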