Using Vision Language Models as Closed-Loop Symbolic Planners for Robotic Applications: A Control-Theoretic Perspective

📅 2025-11-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the poor robustness and high-cost failures of vision-language models (VLMs) when employed as black-box symbolic planners in closed-loop robotic tasks. We introduce, for the first time, a control-theoretic formulation of VLM-based dynamic planning, proposing a synergistic optimization framework that jointly leverages control horizon and warm-starting to explicitly embed symbolic planning within a feedback control loop. Comprehensive experiments demonstrate that our approach significantly improves planning success rate (+27.3%) and real-time performance (39% reduction in reasoning steps) on complex robotic tasks, while enhancing robustness against observation noise and actuation errors. The method establishes an interpretable, analyzable paradigm for reliably deploying VLMs in safety-critical, high-level robotic planning.

Technology Category

Application Category

📝 Abstract
Large Language Models (LLMs) and Vision Language Models (VLMs) have been widely used for embodied symbolic planning. Yet, how to effectively use these models for closed-loop symbolic planning remains largely unexplored. Because they operate as black boxes, LLMs and VLMs can produce unpredictable or costly errors, making their use in high-level robotic planning especially challenging. In this work, we investigate how to use VLMs as closed-loop symbolic planners for robotic applications from a control-theoretic perspective. Concretely, we study how the control horizon and warm-starting impact the performance of VLM symbolic planners. We design and conduct controlled experiments to gain insights that are broadly applicable to utilizing VLMs as closed-loop symbolic planners, and we discuss recommendations that can help improve the performance of VLM symbolic planners.
Problem

Research questions and friction points this paper is trying to address.

Using VLMs for closed-loop robotic symbolic planning
Addressing unpredictable errors in VLM-based planning systems
Investigating control parameters to optimize VLM planner performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using VLMs for closed-loop robotic symbolic planning
Studying control horizon and warm-starting impacts
Providing recommendations to enhance VLM planning performance
🔎 Similar Papers
No similar papers found.
H
Hao Wang
Department of Electrical and Computer Engineering, University of Southern California, CA, USA Department of Aeronautics & Astronautics, Stanford University, CA, USA
S
Sathwik Karnik
Department of Aeronautics & Astronautics, Stanford University, CA, USA
B
Bea Lim
Department of Mechanical Engineering, Stanford University, CA, USA
Somil Bansal
Somil Bansal
Assistant Professor, Stanford University
RoboticsArtificial intelligenceDynamic systems and control