🤖 AI Summary
To address safe navigation for autonomous driving in mixed-traffic scenarios, this paper proposes a novel framework that integrates Vision-Language Models (VLMs) with Model Predictive Control (MPC). Using zero-shot semantic parsing of bird’s-eye-view video, the VLM estimates the positions, dimensions, and velocities of surrounding vehicles in real time. These estimates define elliptical obstacle-avoidance potential fields, which are embedded in an event-triggered MPC trajectory optimization scheme solved via differential dynamic programming with adaptive regularization. Key contributions include: (i) the first integration of VLMs into closed-loop MPC, enabling annotation-free semantic perception and geometry-aware potential field modeling; and (ii) a safety verification layer that assesses potentially unsafe trajectories. Evaluated in SUMO simulations, the method improves obstacle avoidance success rate, trajectory smoothness, and dynamic adaptability, outperforming conventional MPC across multiple quantitative metrics.
📝 Abstract
In this paper, we introduce VisioPath, a novel framework combining vision-language models (VLMs) with model predictive control (MPC) to enable safe autonomous driving in dynamic traffic environments. The proposed approach leverages a bird's-eye view video processing pipeline and zero-shot VLM capabilities to obtain structured information about surrounding vehicles, including their positions, dimensions, and velocities. Using this perception output, we construct elliptical collision-avoidance potential fields around other traffic participants and integrate them into a finite-horizon optimal control problem for trajectory planning. The resulting trajectory optimization is solved via differential dynamic programming with an adaptive regularization scheme and is embedded in an event-triggered MPC loop. To ensure collision-free motion, the framework incorporates a safety verification layer that assesses potentially unsafe trajectories. Extensive simulations in Simulation of Urban Mobility (SUMO) demonstrate that VisioPath outperforms conventional MPC baselines across multiple metrics. By combining modern AI-driven perception with the rigorous foundation of optimal control, VisioPath represents a significant step forward in safe trajectory planning for complex traffic systems.
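The abstract does not give the functional form of the elliptical potential field, so the following is only a minimal sketch of how such a field might be evaluated from the perceived position, dimensions, and velocity of a vehicle. The Gaussian shape, the `Vehicle` record, and the `gain`, `margin`, and `headway` parameters are all assumptions made for illustration and are not taken from VisioPath.

```python
import math
from dataclasses import dataclass


@dataclass
class Vehicle:
    """Hypothetical perception record for one surrounding vehicle."""
    x: float              # centre position, metres
    y: float
    length: float         # metres
    width: float          # metres
    heading: float = 0.0  # radians, 0 = along the +x axis
    speed: float = 0.0    # metres per second along the heading


def elliptical_potential(ego_x, ego_y, obs, gain=10.0, margin=1.0, headway=0.5):
    """Gaussian-shaped repulsive potential with elliptical level sets.

    Assumed form: gain * exp(-((lon / a)^2 + (lat / b)^2)), where (lon, lat)
    is the ego position expressed in the obstacle's body frame.
    """
    # Semi-axes: half the vehicle dimensions plus a safety margin; the
    # longitudinal axis is stretched with speed (an illustrative choice).
    a = obs.length / 2.0 + margin + headway * obs.speed
    b = obs.width / 2.0 + margin

    # Rotate the world-frame offset into the obstacle's body frame.
    dx, dy = ego_x - obs.x, ego_y - obs.y
    c, s = math.cos(obs.heading), math.sin(obs.heading)
    lon = c * dx + s * dy
    lat = -s * dx + c * dy

    return gain * math.exp(-((lon / a) ** 2 + (lat / b) ** 2))


def total_potential(ego_x, ego_y, obstacles, **kwargs):
    """Sum the potentials of all perceived vehicles at one ego position."""
    return sum(elliptical_potential(ego_x, ego_y, o, **kwargs) for o in obstacles)
```

In a trajectory optimizer, a term like `total_potential` would typically be added to the stage cost at each step of the planning horizon, with obstacle positions propagated forward from their estimated velocities. Because the ellipse is longer along the direction of travel, the same clearance is penalized more when the ego vehicle is ahead of or behind an obstacle than when it is beside it.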