🤖 AI Summary
This work proposes a semantic navigation approach for autonomous robots operating in unknown environments, addressing the common oversight of semantic cues such as signs and room numbers that can significantly improve navigation efficiency. The method integrates local perception, frontier-based exploration, and a large language model (LLM) to dynamically interpret environmental text, infer symbolic patterns (e.g., room numbering conventions), and construct a confidence grid that guides exploration in real time. Notably, this is the first framework to employ an LLM for on-the-fly parsing of environmental text to drive forward-looking goal inference. Evaluated on realistic floor plans, the approach achieves Success weighted by Path Length (SPL) over 25% higher than baseline methods, approaching the performance of optimal paths.
📝 Abstract
Autonomous navigation in unfamiliar environments often relies on geometric mapping and planning strategies that overlook rich semantic cues such as signs, room numbers, and textual labels. We propose a novel semantic navigation framework that leverages large language models (LLMs) to infer patterns from partial observations and predict regions where the goal is most likely located. Our method combines local perceptual inputs with frontier-based exploration and periodic LLM queries, which extract symbolic patterns (e.g., room numbering schemes and building layout structures) and update a confidence grid used to guide exploration. This enables robots to move efficiently toward goal locations labeled with textual identifiers (e.g., "room 8") even before direct observation. We demonstrate that this approach enables more efficient navigation in sparse, partially observable grid environments by exploiting symbolic patterns. Experiments across environments modeled after real floor plans show that our approach consistently achieves near-optimal paths and outperforms baselines by over 25% in Success weighted by Path Length.
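To make the core idea concrete, here is a minimal sketch of confidence-guided frontier selection on a 2D occupancy grid. This is an illustrative reconstruction, not the paper's implementation: the grid encoding, the `find_frontiers` and `select_goal` helpers, and the confidence-per-distance scoring rule are all assumptions; in the actual framework the confidence grid would be updated from LLM inferences over observed text.

```python
import numpy as np

# Assumed grid encoding: -1 = unknown, 0 = free, 1 = obstacle.

def find_frontiers(occ):
    """Return free cells adjacent to at least one unknown cell."""
    frontiers = []
    h, w = occ.shape
    for r in range(h):
        for c in range(w):
            if occ[r, c] != 0:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and occ[nr, nc] == -1:
                    frontiers.append((r, c))
                    break
    return frontiers

def select_goal(occ, confidence, robot):
    """Pick the frontier with the best confidence-per-distance trade-off.

    `confidence[r, c]` stands in for the LLM-derived likelihood that the
    goal lies beyond that frontier (e.g., inferred from a room-numbering
    pattern); higher values pull exploration toward that region.
    """
    best, best_score = None, -np.inf
    for (r, c) in find_frontiers(occ):
        dist = abs(r - robot[0]) + abs(c - robot[1])  # Manhattan proxy
        score = confidence[r, c] / (1 + dist)
        if score > best_score:
            best, best_score = (r, c), score
    return best
```

With a uniform confidence grid this reduces to nearest-frontier exploration; a peaked confidence grid redirects the robot toward the predicted goal region before it is directly observed.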