🤖 AI Summary
This work addresses two problems: the high computational and memory cost of classical A* in large-scale, complex environments, and the limited spatial awareness of existing large language model–based path planners, which often produce geometrically inconsistent waypoints in topologically intricate regions. To overcome these limitations, the study introduces, for the first time, the spatial grounding capability of vision–language models into path planning. It integrates multimodal perception with incremental heuristic search and proposes an adaptive decay mechanism that dynamically suppresses the influence of uncertain waypoints. The resulting approach generates near-optimal, geometrically consistent trajectories in highly cluttered and topologically complex environments, significantly reducing computational and memory overhead while improving planning efficiency and robustness.
📝 Abstract
Autonomous path planning requires a synergy between global reasoning and geometric precision, especially in complex or cluttered environments. While classical A* is valued for its optimality, it incurs prohibitive computational and memory costs in large-scale scenarios. Recent attempts to mitigate these limitations by using Large Language Models for waypoint guidance remain insufficient, as they rely only on text-based reasoning without spatial grounding. As a result, such models often produce incorrect waypoints in topologically complex environments with dead ends, and lack the perceptual capacity to interpret ambiguous physical boundaries. These inconsistencies lead to costly corrective expansions and undermine the intended computational efficiency. We introduce MMP-A*, a multimodal framework that integrates the spatial grounding capabilities of vision-language models with a novel adaptive decay mechanism. By anchoring high-level reasoning in physical geometry, the framework produces coherent waypoint guidance that addresses the limitations of text-only planners. The adaptive decay mechanism dynamically regulates the influence of uncertain waypoints within the heuristic, ensuring geometric validity while substantially reducing memory overhead. To evaluate robustness, we test the framework in challenging environments characterized by severe clutter and topological complexity. Experimental results show that MMP-A* achieves near-optimal trajectories with significantly reduced operational costs, demonstrating its potential as a perception-grounded and computationally efficient paradigm for autonomous navigation.
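To make the idea of a decaying waypoint term concrete, the following is a minimal illustrative sketch, not the MMP-A* implementation. The abstract does not specify the decay rule, so everything here is an assumption: the function name `guided_astar`, the parameters `w0` and `decay`, the geometric per-expansion decay schedule, the 4-connected occupancy grid, and the Manhattan distance are all hypothetical choices. The waypoint is passed in directly; in the framework described above it would come from a vision-language model. Because the waypoint term makes the heuristic inadmissible and time-varying, this sketch does not guarantee an optimal path in general.

```python
import heapq


def manhattan(a, b):
    """Manhattan distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def guided_astar(grid, start, goal, waypoint, w0=1.0, decay=0.9):
    """A*-style search on a 4-connected grid (0 = free, 1 = obstacle).

    Hypothetical priority: f = g + h_goal + weight * h_waypoint, where
    weight starts at w0 and is multiplied by `decay` after every node
    expansion, so a misleading waypoint loses influence as search proceeds.
    Returns (path, expansions); path is None if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    weight = w0
    best_g = {start: 0}          # cheapest known cost to each cell
    parent = {start: None}       # back-pointers for path reconstruction
    frontier = [(manhattan(start, goal), 0, start)]
    expansions = 0

    while frontier:
        _, g, node = heapq.heappop(frontier)
        if g > best_g[node]:
            continue             # stale entry; a cheaper route was found
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], expansions

        expansions += 1
        weight *= decay          # assumed decay schedule (geometric)

        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc]:
                continue         # outside the grid or blocked
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = node
                f = ng + manhattan(nxt, goal) + weight * manhattan(nxt, waypoint)
                heapq.heappush(frontier, (f, ng, nxt))

    return None, expansions
```

Setting `w0=0` reduces the sketch to plain A* with a Manhattan heuristic, which gives a baseline for comparing the number of expansions with and without waypoint guidance.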