🤖 AI Summary
Indoor autonomous navigation suffers from inadequate semantic representation, semantic information that is hard to edit, and planners that frequently generate physically infeasible paths. Method: We propose the Semantically Enhanced Topological Map (SENT-Map), a lightweight JSON-based representation that stores semantic knowledge in a form both humans and foundation models (FMs) can read, so the map can be edited interactively through natural language. A node-anchoring mechanism restricts planning to existing map nodes to ensure physical feasibility. SENT-Map uses a vision foundation model for environment perception and semantic mapping, and introduces a two-stage, natural-language-driven planning framework that lets small, locally deployable FMs carry out complex tasks. Contribution/Results: Experiments show that SENT-Map's semantic enhancement lets even small, locally deployable FMs plan successfully over indoor environments, improving task success rates under resource-constrained conditions and offering a scalable semantic modeling approach for lightweight embodied intelligence.
📝 Abstract
We introduce SENT-Map, a semantically enhanced topological map for representing indoor environments, designed to support autonomous navigation and manipulation by leveraging advances in foundation models (FMs). By representing the environment in a JSON text format, we enable semantic information to be added and edited in a form that both humans and FMs understand, while grounding the robot to existing nodes during planning to avoid infeasible states during deployment. Our proposed framework employs a two-stage approach: first, the environment is mapped alongside an operator with a Vision-FM; then, the SENT-Map representation is passed together with a natural-language query to an FM for planning. Our experimental results show that semantic enhancement enables even small, locally deployable FMs to plan successfully over indoor environments.
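To make the two ideas in the abstract concrete (a JSON text map that can be edited, and plans grounded to existing nodes), the following is a minimal sketch in Python. It is not the authors' schema or implementation: the field names (`nodes`, `edges`, `description`, `objects`), the room names, and the helpers `load_map` and `is_grounded` are all hypothetical, chosen only for illustration.

```python
import json

# Hypothetical SENT-Map-style document. The schema and room names are
# illustrative assumptions, not the format used in the paper.
SENT_MAP_JSON = """
{
  "nodes": {
    "kitchen": {"description": "Kitchen with a counter and sink", "objects": ["mug", "kettle"]},
    "hallway": {"description": "Corridor connecting the rooms", "objects": []},
    "office":  {"description": "Office with a desk", "objects": ["laptop"]}
  },
  "edges": [["kitchen", "hallway"], ["hallway", "office"]]
}
"""


def load_map(text):
    """Parse the JSON map and build an undirected adjacency table."""
    data = json.loads(text)
    adjacency = {name: set() for name in data["nodes"]}
    for a, b in data["edges"]:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return data, adjacency


def is_grounded(plan, adjacency):
    """Return True if every step names an existing node and each pair of
    consecutive steps is joined by an edge in the map."""
    if any(node not in adjacency for node in plan):
        return False
    return all(b in adjacency[a] for a, b in zip(plan, plan[1:]))


sent_map, adjacency = load_map(SENT_MAP_JSON)

# A semantic edit is an ordinary change to the parsed JSON data.
sent_map["nodes"]["office"]["objects"].append("stapler")

# Candidate plans, e.g. as an FM might propose them (hypothetical).
print(is_grounded(["kitchen", "hallway", "office"], adjacency))  # follows map edges
print(is_grounded(["kitchen", "office"], adjacency))             # skips the hallway
print(is_grounded(["kitchen", "garage"], adjacency))             # unknown node
```

In this sketch, a plan that names a node missing from the map, or that jumps between unconnected nodes, is rejected before execution. This is one simple way to read "grounding the robot to existing nodes"; the paper's actual mechanism may differ.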