Using Vision Language Models as Closed-Loop Symbolic Planners for Robotic Applications: A Control-Theoretic Perspective

📅 2025-11-10

📈 Citations: 0

✨ Influential: 0

career value

216K/year

🤖 AI Summary

This work addresses the poor robustness and high-cost failures of vision-language models (VLMs) when employed as black-box symbolic planners in closed-loop robotic tasks. We introduce, for the first time, a control-theoretic formulation of VLM-based dynamic planning, proposing a synergistic optimization framework that jointly leverages control horizon and warm-starting to explicitly embed symbolic planning within a feedback control loop. Comprehensive experiments demonstrate that our approach significantly improves planning success rate (+27.3%) and real-time performance (39% reduction in reasoning steps) on complex robotic tasks, while enhancing robustness against observation noise and actuation errors. The method establishes an interpretable, analyzable paradigm for reliably deploying VLMs in safety-critical, high-level robotic planning.

Technology Category

Application Category

📝 Abstract

Large Language Models (LLMs) and Vision Language Models (VLMs) have been widely used for embodied symbolic planning. Yet, how to effectively use these models for closed-loop symbolic planning remains largely unexplored. Because they operate as black boxes, LLMs and VLMs can produce unpredictable or costly errors, making their use in high-level robotic planning especially challenging. In this work, we investigate how to use VLMs as closed-loop symbolic planners for robotic applications from a control-theoretic perspective. Concretely, we study how the control horizon and warm-starting impact the performance of VLM symbolic planners. We design and conduct controlled experiments to gain insights that are broadly applicable to utilizing VLMs as closed-loop symbolic planners, and we discuss recommendations that can help improve the performance of VLM symbolic planners.

Problem

Research questions and friction points this paper is trying to address.

Using VLMs for closed-loop robotic symbolic planning

Addressing unpredictable errors in VLM-based planning systems

Investigating control parameters to optimize VLM planner performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using VLMs for closed-loop robotic symbolic planning

Studying control horizon and warm-starting impacts

Providing recommendations to enhance VLM planning performance

🔎 Similar Papers

A Survey of Language-Based Communication in Robotics