🤖 AI Summary
This paper addresses key challenges in local navigation: poor cross-morphology generalization, reliance on morphology-specific data, tight coupling between planning and control, and "catastrophic averaging" in multimodal decision-making caused by deterministic models. To address these, the authors propose CE-Nav, a two-stage, decoupled cross-morphology navigation framework. Its core contributions are: (1) the first use of a conditional normalizing flow model, VelFlow, to explicitly model multimodal action distributions and eliminate averaging bias; and (2) the decoupling of geometric reasoning from dynamics adaptation: a morphology-agnostic universal expert is trained via large-scale imitation learning, then frozen to guide a lightweight, dynamics-aware refiner optimized with online reinforcement learning, drastically reducing adaptation cost. CE-Nav achieves state-of-the-art performance on quadrupedal, bipedal, and quadrotor robots, transfers rapidly to new platforms with minimal real-world interaction, and has been successfully deployed on physical hardware, demonstrating both efficiency and scalability.
📝 Abstract
Generalizing local navigation policies across diverse robot morphologies is a critical challenge. Progress is often hindered by the need for costly, embodiment-specific data, the tight coupling of planning and control, and the "catastrophic averaging" problem, where deterministic models fail to capture multimodal decisions (e.g., turning left or right). We introduce CE-Nav, a novel two-stage (IL-then-RL) framework that systematically decouples universal geometric reasoning from embodiment-specific dynamic adaptation. First, we train an embodiment-agnostic General Expert offline using imitation learning. This expert, a conditional normalizing flow model named VelFlow, learns the full distribution of kinematically sound actions from a large-scale dataset generated by a classical planner, completely avoiding real robot data and resolving the multimodality issue. Second, for a new robot, we freeze the expert and use it as a guiding prior to train a lightweight, Dynamics-Aware Refiner via online reinforcement learning. This refiner rapidly learns to compensate for the target robot's specific dynamics and controller imperfections with minimal environmental interaction. Extensive experiments on quadrupeds, bipeds, and quadrotors show that CE-Nav achieves state-of-the-art performance while drastically reducing adaptation cost. Successful real-world deployments further validate our approach as an efficient and scalable solution for building generalizable navigation systems.
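To make the normalizing-flow idea concrete, the sketch below shows why a conditional flow avoids averaging: it maps Gaussian noise through invertible, context-conditioned transforms, so it can place probability mass on several distinct actions (e.g., "steer left" and "steer right") while still giving an exact log-density. This is a minimal RealNVP-style affine-coupling toy in NumPy with random placeholder weights; the class names, layer sizes, and 2-D velocity action space are illustrative assumptions, not the paper's VelFlow architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, W1, b1, W2, b2):
    """Tiny conditioner network producing (scale, shift)."""
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

class CouplingLayer:
    """Affine coupling: pass half of z through, warp the other half
    with a scale/shift predicted from the passthrough half + context."""
    def __init__(self, ctx_dim, seed):
        r = np.random.default_rng(seed)
        d_in = 1 + ctx_dim                      # passthrough half + context
        self.W1 = r.normal(size=(d_in, 16)) * 0.3
        self.b1 = np.zeros(16)
        self.W2 = r.normal(size=(16, 2)) * 0.3  # outputs (scale, shift)
        self.b2 = np.zeros(2)

    def forward(self, z, c):
        z1, z2 = z[..., :1], z[..., 1:]
        s, t = np.split(mlp(np.concatenate([z1, c], -1),
                            self.W1, self.b1, self.W2, self.b2), 2, axis=-1)
        return np.concatenate([z1, z2 * np.exp(s) + t], -1), s.sum(-1)

    def inverse(self, x, c):
        x1, x2 = x[..., :1], x[..., 1:]
        s, t = np.split(mlp(np.concatenate([x1, c], -1),
                            self.W1, self.b1, self.W2, self.b2), 2, axis=-1)
        return np.concatenate([x1, (x2 - t) * np.exp(-s)], -1), -s.sum(-1)

class CondFlow:
    """Stack of couplings with a dimension swap between layers."""
    def __init__(self, ctx_dim, n_layers=2):
        self.layers = [CouplingLayer(ctx_dim, seed=i) for i in range(n_layers)]

    def sample(self, c, n):
        z = rng.standard_normal((n, 2))         # base Gaussian noise
        cb = np.broadcast_to(c, (n, c.shape[-1]))
        for lay in self.layers:
            z, _ = lay.forward(z, cb)
            z = z[:, ::-1]                      # swap halves between layers
        return z

    def log_prob(self, x, c):
        """Exact density via the inverse pass (change of variables)."""
        ldj = 0.0
        cb = np.broadcast_to(c, (x.shape[0], c.shape[-1]))
        for lay in reversed(self.layers):
            x = x[:, ::-1]
            x, d = lay.inverse(x, cb)
            ldj += d
        return -0.5 * (x ** 2).sum(-1) - np.log(2 * np.pi) + ldj

flow = CondFlow(ctx_dim=2)
goal_ctx = np.array([1.0, 0.0])                 # e.g., goal straight ahead
actions = flow.sample(goal_ctx, 5)              # 5 distinct (vx, wz) samples
densities = flow.log_prob(actions, goal_ctx)
```

Because sampling draws fresh noise each time, repeated calls return distinct plausible actions rather than one averaged command, which is the property the abstract credits with resolving the catastrophic-averaging problem.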
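The two-stage decoupling described above can also be sketched in a few lines: a frozen, embodiment-agnostic expert proposes a velocity command, and a small per-robot refiner adds a bounded, dynamics-aware residual. Everything here is a hypothetical stand-in (the heuristic `plan`, the single linear layer, the 0.1 residual bound), meant only to show the structure of expert-as-prior plus lightweight correction, not CE-Nav's actual networks or RL training loop.

```python
import numpy as np

rng = np.random.default_rng(1)

class FrozenExpert:
    """Stand-in for the pretrained General Expert: maps a local goal
    observation to a sampled velocity command (vx, wz). Frozen at
    adaptation time."""
    def plan(self, obs):
        heading = np.arctan2(obs[1], obs[0])      # angle to goal
        return np.array([0.5, np.clip(heading, -1.0, 1.0)])

class Refiner:
    """Lightweight residual policy, trained per robot (here: one random
    linear layer as a placeholder for the RL-optimized network)."""
    def __init__(self, obs_dim, act_dim=2, scale=0.1):
        self.W = rng.normal(size=(obs_dim + act_dim, act_dim)) * 0.01
        self.scale = scale  # keep corrections small so the prior dominates

    def correct(self, obs, expert_cmd):
        x = np.concatenate([obs, expert_cmd])
        return self.scale * np.tanh(x @ self.W)   # bounded residual

expert = FrozenExpert()
refiner = Refiner(obs_dim=2)

obs = np.array([2.0, 1.0])                        # goal 2 m ahead, 1 m left
cmd = expert.plan(obs)                            # geometric reasoning
action = cmd + refiner.correct(obs, cmd)          # dynamics-aware correction
```

The design point the abstract emphasizes is visible here: only the tiny `Refiner` has trainable parameters, so online RL for a new morphology touches a small module while the expensive geometric reasoning stays frozen and shared.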