🤖 AI Summary
Existing semantics-driven terrain cost maps for autonomous robot navigation struggle to adapt to newly introduced terrain preferences at deployment time. Method: This paper proposes a retraining-free, end-to-end terrain cost map generation framework that takes a bird’s-eye-view (BEV) image and a user-specified preference context as inputs. It jointly models preference-conditioned generation and cross-domain representation learning to enable zero-shot terrain cost generalization in open-world settings. Contributions/Results: To our knowledge, this is the first approach enabling real-time, on-the-fly cost adaptation to previously unseen terrain categories during deployment, without relying on predefined semantic classes. The framework is trained with a staged strategy that combines real and synthetic data. Experiments show improvements over semantics-based mapping and state-of-the-art representation-learning baselines in both novel-preference and novel-terrain scenarios, indicating strong runtime adaptability and robust zero-shot generalization.
📝 Abstract
In autonomous robot navigation, terrain cost assignment is typically performed using a semantics-based paradigm in which terrain is first labeled by a pre-trained semantic classifier and costs are then assigned according to a user-defined mapping from label to cost. While this approach adapts rapidly to changing user preferences, it can only express preferences over terrain types already known to the semantic classifier. In this letter, we hypothesize that a machine-learning-based alternative to the semantics-based paradigm will allow rapid adaptation of cost assignment to preferences expressed over new terrains at deployment time, without the need for additional training. To investigate this hypothesis, we introduce and study PACER, a novel approach to costmap generation that accepts as input a single bird's-eye-view (BEV) image of the surrounding area along with a user-specified preference context, and generates a corresponding BEV costmap that aligns with the preference context. Using a staged training procedure that leverages real and synthetic data, we find that PACER is able to adapt to new user preferences at deployment time while also generalizing better to novel terrains than both semantics-based and representation-learning approaches.
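For readers unfamiliar with the semantics-based paradigm the abstract contrasts against, the following is a minimal sketch of the label-then-lookup step, not code from the paper. The label set `LABELS`, the function `semantic_costmap`, and the `unknown_cost` fallback are hypothetical names chosen for illustration, and the classifier output is assumed to be a BEV grid of integer class indices.

```python
import numpy as np

# Hypothetical label set of a pre-trained semantic classifier (illustrative only).
LABELS = ["grass", "pavement", "gravel"]


def semantic_costmap(label_map, preferences, unknown_cost=1.0):
    """Turn a BEV grid of class indices into a BEV costmap via a label-to-cost table."""
    # One cost per known label; labels without a stated preference
    # fall back to unknown_cost (an assumption of this sketch).
    table = np.array(
        [preferences.get(name, unknown_cost) for name in LABELS],
        dtype=np.float32,
    )
    # Integer-array indexing applies the lookup to every cell at once.
    return table[label_map]


# A 2x2 BEV label grid: grass, pavement / gravel, pavement.
label_map = np.array([[0, 1], [2, 1]])

# Changing preferences only requires editing this mapping, with no retraining.
prefs = {"grass": 0.8, "pavement": 0.1}
costmap = semantic_costmap(label_map, prefs)
```

The sketch makes the limitation visible: a preference can only be stated for a name in `LABELS`, so a terrain type the classifier was never trained on has no entry to attach a cost to.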