LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People

📅 2026-04-26

🤖 AI Summary
This work addresses the lack of affordable indoor navigation for blind and low-vision (BLV) people by proposing a multi-agent framework that avoids costly per-building infrastructure. Starting from a single floor plan image, the method builds a spatial knowledge graph through a self-correcting parsing pipeline with iterative retries and corrective feedback, then pairs a path planner with a safety evaluator to generate safe, accessible navigation instructions. The system combines large language models with multi-agent collaboration in a lightweight architecture. Evaluated on two floors of the UMBC Math and Psychology building (MP-1 and MP-3) and on the CVC-FP benchmark, it outperforms the strongest single-call LLM baseline (Claude 3.7 Sonnet) on short, medium, and long routes on both floors.
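To make the idea of safety-aware path planning over a spatial knowledge graph concrete, here is a minimal sketch. It is not the paper's planner: the graph layout, the node names, the additive hazard penalty, and the function `safest_path` are all assumptions chosen for illustration. It runs Dijkstra's algorithm on a cost that adds a penalty to edges flagged as hazardous, so a longer but safer route can win.

```python
import heapq

# Hypothetical spatial knowledge graph (not from the paper):
# node -> list of (neighbor, distance_in_meters, hazard_penalty).
GRAPH = {
    "entrance": [("hallway", 5.0, 0.0), ("stairs", 2.0, 10.0)],
    "hallway": [("room_101", 4.0, 0.0)],
    "stairs": [("room_101", 1.0, 0.0)],
    "room_101": [],
}


def safest_path(graph, start, goal):
    """Dijkstra over cost = distance + hazard penalty.

    Returns (total_cost, path) or None if the goal is unreachable.
    """
    frontier = [(0.0, start, [start])]  # min-heap ordered by accumulated cost
    best = {}  # cheapest cost at which each node has been expanded
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in best and best[node] <= cost:
            continue  # already expanded via a cheaper route
        best[node] = cost
        for neighbor, dist, hazard in graph.get(node, []):
            heapq.heappush(
                frontier, (cost + dist + hazard, neighbor, path + [neighbor])
            )
    return None


if __name__ == "__main__":
    print(safest_path(GRAPH, "entrance", "room_101"))
```

In this toy graph the stairs route is shorter by distance (3 m versus 9 m), but the assumed hazard penalty pushes its cost to 13, so the hallway route is returned. How the actual Safety Evaluator scores hazards is not described in this summary.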

📝 Abstract
Indoor navigation remains a critical accessibility challenge for blind and low-vision (BLV) individuals, as existing solutions rely on costly per-building infrastructure. We present an agentic framework that converts a single floor plan image into a structured, retrievable knowledge base to generate safe, accessible navigation instructions with lightweight infrastructure. The system has two phases: a multi-agent module that parses the floor plan into a spatial knowledge graph through a self-correcting pipeline with iterative retry loops and corrective feedback, and a Path Planner that generates accessible navigation instructions, with a Safety Evaluator agent assessing potential hazards along each route. We evaluate the system on the real-world UMBC Math and Psychology building (floors MP-1 and MP-3) and on the CVC-FP benchmark. On MP-1, we achieve success rates of 92.31%, 76.92%, and 61.54% for short, medium, and long routes, outperforming the strongest single-call baseline (Claude 3.7 Sonnet) at 84.62%, 69.23%, and 53.85%. On MP-3, we reach 76.92%, 61.54%, and 38.46%, compared to the best baseline at 61.54%, 46.15%, and 23.08%. These results show consistent gains over single-call LLM baselines and demonstrate that our workflow is a scalable solution for accessible indoor navigation for BLV individuals.
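The self-correcting pipeline described above (parse, check, retry with corrective feedback) can be sketched as a simple loop. This is an illustrative sketch only, not the authors' implementation: `parse_with_retries`, `validate_graph`, `stub_parse`, the graph dictionary layout, and the limit of three attempts are all assumptions, and the stub stands in for an LLM-backed parsing agent.

```python
def validate_graph(graph):
    """Return a list of error messages; an empty list means the graph is consistent."""
    nodes = set(graph["nodes"])
    errors = []
    for src, dst in graph["edges"]:
        for name in (src, dst):
            if name not in nodes:
                errors.append(f"edge ({src}, {dst}) references unknown node '{name}'")
    return errors


def parse_with_retries(parse, validate, max_attempts=3):
    """Call `parse` until `validate` reports no errors, feeding errors back each time.

    `parse` takes the previous feedback string (or None on the first attempt).
    Returns (graph, attempts_used); raises RuntimeError if every attempt fails.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        graph = parse(feedback)
        errors = validate(graph)
        if not errors:
            return graph, attempt
        # Corrective feedback: tell the parser exactly what was wrong.
        feedback = "Fix these issues: " + "; ".join(errors)
    raise RuntimeError(f"no valid graph after {max_attempts} attempts: {feedback}")


def stub_parse(feedback):
    """Hypothetical stand-in for an LLM parsing agent.

    The first attempt omits a node; once feedback arrives, it returns a fixed graph.
    """
    nodes = ["entrance", "hallway"]
    if feedback is not None:
        nodes.append("room_101")
    return {"nodes": nodes, "edges": [("entrance", "hallway"), ("hallway", "room_101")]}


if __name__ == "__main__":
    graph, attempts = parse_with_retries(stub_parse, validate_graph)
    print(attempts, graph["nodes"])
```

Bounding the number of attempts keeps a persistently failing parse from looping forever; what the real system does when retries are exhausted is not stated in the abstract.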
Problem

Research questions and friction points this paper is trying to address.

accessible indoor navigation
blind and low-vision
floor plan parsing
spatial knowledge graph
navigation instructions
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided agentic parsing
accessible indoor navigation
spatial knowledge graph
self-correcting multi-agent system
blind and low-vision accessibility