🤖 AI Summary
Existing visual navigation policies often overfit to a single track, and their performance degrades when track geometry changes. To address this, we propose FalconGym 2.0, a photorealistic simulation framework built on Gaussian Splatting (GSplat), together with a Performance-Guided Refinement (PGR) training algorithm. FalconGym 2.0 provides an Edit API that programmatically generates diverse static and dynamic tracks in milliseconds. PGR exploits this editability to concentrate training on challenging tracks while iteratively improving the policy. Across two case studies (fixed-wing UAVs and quadrotors), a single learned policy generalizes to three unseen tracks with 100% success without per-track retraining and maintains higher success rates than state-of-the-art baselines under gate-pose perturbations. Transferred zero-shot to quadrotor hardware, the policy passes 69 of 70 gates (98.6% success rate) over 30 trials spanning two three-gate tracks and a moving-gate track.
📝 Abstract
Visual policy design is crucial for aerial navigation. However, state-of-the-art visual policies often overfit to a single track, and their performance degrades when track geometry changes. We develop FalconGym 2.0, a photorealistic simulation framework built on Gaussian Splatting (GSplat) with an Edit API that programmatically generates diverse static and dynamic tracks in milliseconds. Leveraging FalconGym 2.0's editability, we propose a Performance-Guided Refinement (PGR) algorithm, which concentrates the visual policy's training on challenging tracks while iteratively improving its performance. Across two case studies (fixed-wing UAVs and quadrotors) with distinct dynamics and environments, we show that a single visual policy trained with PGR in FalconGym 2.0 outperforms state-of-the-art baselines in generalization and robustness: it generalizes to three unseen tracks with 100% success without per-track retraining and maintains higher success rates under gate-pose perturbations. Finally, we demonstrate that the visual policy trained with PGR in FalconGym 2.0 transfers zero-shot from simulation to quadrotor hardware, achieving a 98.6% success rate (69/70 gates) over 30 trials spanning two three-gate tracks and a moving-gate track.
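The abstract does not say how PGR decides which tracks count as challenging. One plausible reading, offered only as an illustrative sketch and not as the authors' method, is to sample training tracks in proportion to the policy's current failure rate on each. The function names, the `floor` parameter, and the track labels below are all hypothetical.

```python
import random
from typing import Dict, List


def refinement_weights(success_rates: Dict[str, float],
                       floor: float = 0.05) -> Dict[str, float]:
    """Map per-track success rates to sampling weights that favor hard tracks.

    Assumption: weight is proportional to failure rate (1 - success rate).
    The floor keeps already-solved tracks in rotation so they are not forgotten.
    """
    raw = {name: max(1.0 - rate, floor) for name, rate in success_rates.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


def sample_training_tracks(weights: Dict[str, float], k: int,
                           rng: random.Random) -> List[str]:
    """Draw k track names (with replacement) according to the weights."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


# Hypothetical evaluation results for three tracks.
rates = {"easy": 1.0, "medium": 0.8, "hard": 0.2}
weights = refinement_weights(rates)
batch = sample_training_tracks(weights, k=8, rng=random.Random(0))
```

In a full loop, one would re-evaluate the policy after each training round and recompute the weights, so the sampling distribution follows the policy's current weaknesses.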