🤖 AI Summary
This work addresses socially inappropriate navigation by mobile robots in human environments, which arises when planners ignore local behavioral norms. The authors propose NORM-Nav, a zero-shot framework that integrates natural language behavioral instructions into costmap-based navigation without fine-tuning. A large language model parses each textual instruction into structured constraints, which are grounded using real-time vision-LiDAR perception and encoded as multi-layer costmaps. These layers carry geometric, semantic, directional, and velocity constraints and plug directly into a standard grid-based planner. In simulation and real-world experiments, NORM-Nav improves task success rates over representative baselines and produces trajectories closer to human reference trajectories.
📝 Abstract
Mobile robots operating in human-centered environments must generate not only collision-free paths but also trajectories that follow local behavioral conventions. Conventional costmap-based navigation emphasizes geometric feasibility and often overlooks such requirements, which can result in socially inappropriate behaviors. This paper presents NORM-Nav, a zero-shot framework that integrates natural language behavioral constraints into costmap-based planning. An LLM parses each instruction into structured constraints and grounds them using real-time vision-LiDAR perception. These constraints are encoded as multi-layer costmaps that represent geometric, semantic, directional, and velocity cues and are directly compatible with standard grid-based planners. Simulation and real-world experiments indicate that NORM-Nav improves task success rates and produces trajectories closer to human references than representative baselines. The project website is available at https://ei-nav.github.io/NORM-Nav.
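To make the idea of multi-layer costmaps feeding a standard grid-based planner concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each layer is a plain 2D grid of non-negative costs, that layers are fused by a weighted sum, and that the planner is Dijkstra's algorithm on a 4-connected grid. The names `fuse_layers` and `plan`, the layer name `"semantic"`, the weights, and the 5x5 map are all hypothetical. Only a scalar semantic penalty is shown; the directional and velocity layers described in the abstract are omitted.

```python
import heapq


def fuse_layers(layers, weights):
    """Weighted sum of equally sized cost layers (dict: name -> 2D list)."""
    first = next(iter(layers.values()))
    fused = [[0.0] * len(first[0]) for _ in first]
    for name, layer in layers.items():
        w = weights.get(name, 1.0)  # hypothetical default weight
        for r, row in enumerate(layer):
            for c, value in enumerate(row):
                fused[r][c] += w * value
    return fused


def plan(cost, blocked, start, goal):
    """Dijkstra on a 4-connected grid; each step costs 1 plus the entered cell's cost."""
    rows, cols = len(cost), len(cost[0])
    dist = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not blocked[nr][nc]:
                nd = d + 1.0 + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    parent[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return None  # goal unreachable
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


ROWS, COLS = 5, 5
blocked = [[False] * COLS for _ in range(ROWS)]  # geometric layer: no obstacles here
empty = [[0.0] * COLS for _ in range(ROWS)]

# Hypothetical semantic layer, e.g. for an instruction such as "stay off the carpet":
# the three middle cells of row 2 are traversable but penalized.
semantic = [[0.0] * COLS for _ in range(ROWS)]
for col in (1, 2, 3):
    semantic[2][col] = 10.0

baseline = plan(empty, blocked, (2, 0), (2, 4))
fused = fuse_layers({"semantic": semantic}, {"semantic": 1.0})
path = plan(fused, blocked, (2, 0), (2, 4))
print(baseline)
print(path)
```

With geometry alone, the planner cuts straight across row 2. Once the semantic layer is added, the same planner detours around the penalized cells, which illustrates how behavioral constraints can change the trajectory without changing the planner itself.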