VisioPath: Vision-Language Enhanced Model Predictive Control for Safe Autonomous Navigation in Mixed Traffic

📅 2025-07-08
🤖 AI Summary
To address safe navigation challenges for autonomous driving in mixed-traffic scenarios, this paper proposes a novel framework integrating Vision-Language Models (VLMs) with Model Predictive Control (MPC). Leveraging zero-shot semantic parsing of bird’s-eye-view video, the VLM enables real-time estimation of surrounding vehicles’ positions, dimensions, and velocities. These estimates inform an elliptical obstacle-avoidance potential field, which is embedded into an event-triggered MPC trajectory optimization scheme solved via differential dynamic programming with adaptive regularization. Key contributions include: (i) the first integration of VLMs into closed-loop MPC, enabling annotation-free semantic perception and geometry-aware potential field modeling; and (ii) a formal safety verification layer for rigorous trajectory risk assessment. Evaluated in SUMO simulations, the method significantly improves obstacle avoidance success rate, trajectory smoothness, and dynamic adaptability, outperforming conventional MPC across multiple quantitative metrics.

📝 Abstract
In this paper, we introduce VisioPath, a novel framework combining vision-language models (VLMs) with model predictive control (MPC) to enable safe autonomous driving in dynamic traffic environments. The proposed approach leverages a bird's-eye view video processing pipeline and zero-shot VLM capabilities to obtain structured information about surrounding vehicles, including their positions, dimensions, and velocities. Using this rich perception output, we construct elliptical collision-avoidance potential fields around other traffic participants, which are seamlessly integrated into a finite-horizon optimal control problem for trajectory planning. The resulting trajectory optimization is solved via differential dynamic programming with an adaptive regularization scheme and is embedded in an event-triggered MPC loop. To ensure collision-free motion, the framework incorporates a safety verification layer that assesses potentially unsafe trajectories. Extensive simulations in Simulation of Urban Mobility (SUMO) demonstrate that VisioPath outperforms conventional MPC baselines across multiple metrics. By combining modern AI-driven perception with the rigorous foundation of optimal control, VisioPath represents a significant step forward in safe trajectory planning for complex traffic systems.
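To make the idea of an elliptical collision-avoidance potential field concrete, here is a minimal sketch. The abstract does not give the functional form, so the Gaussian-shaped potential, the `gain` and `margin` parameters, and the function name `elliptical_potential` are all illustrative assumptions rather than the authors' formulation. The only elements taken from the text are the inputs (obstacle position and dimensions) and the elliptical shape.

```python
import math

def elliptical_potential(ego_x, ego_y, obs_x, obs_y, obs_length, obs_width,
                         obs_heading=0.0, gain=1.0, margin=1.0):
    """Hypothetical elliptical repulsive potential around one obstacle vehicle.

    The exponential form, gain, and margin are assumptions for illustration;
    the source only states that the fields are elliptical.
    """
    # Semi-axes: half the vehicle dimensions plus an assumed safety margin.
    a = obs_length / 2.0 + margin   # along the obstacle's heading
    b = obs_width / 2.0 + margin    # perpendicular to the heading

    # Express the ego position in the obstacle's body frame.
    dx, dy = ego_x - obs_x, ego_y - obs_y
    c, s = math.cos(obs_heading), math.sin(obs_heading)
    lon = c * dx + s * dy
    lat = -s * dx + c * dy

    # Normalized squared elliptical distance; equals 1 on the ellipse boundary.
    d2 = (lon / a) ** 2 + (lat / b) ** 2

    # Cost peaks at the obstacle center and decays smoothly with distance.
    return gain * math.exp(-d2)
```

Because the ellipse is longer along the obstacle's heading than across it, a point 3 m ahead of the obstacle receives a higher cost than a point 3 m to its side. A smooth, differentiable cost of this kind is convenient for gradient-based solvers such as differential dynamic programming, which is one plausible reason to prefer it over a hard constraint.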
Problem

Research questions and friction points this paper is trying to address.

Enabling safe autonomous driving in dynamic traffic environments
Combining vision-language models with model predictive control
Ensuring collision-free motion via a safety verification layer
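As a rough illustration of what a safety verification step could look like, the sketch below checks a candidate ego trajectory against safety ellipses around predicted obstacle positions. The source does not describe how its verification layer works, so everything here is assumed: the constant-velocity obstacle prediction, the axis-aligned ellipses (obstacles taken to travel along the x-axis), the point-mass ego vehicle, and the hypothetical names `first_unsafe_step`, `dt`, and `margin`.

```python
def first_unsafe_step(ego_traj, obstacles, dt=0.1, margin=0.5):
    """Return the index of the first trajectory step that falls inside any
    obstacle's safety ellipse, or None if the whole trajectory is clear.

    ego_traj:  list of (x, y) ego positions, one per time step.
    obstacles: list of dicts with keys x, y, vx, vy, length, width.
    Assumptions (not from the source): constant-velocity prediction,
    axis-aligned ellipses, ego treated as a point covered by `margin`.
    """
    for k, (ex, ey) in enumerate(ego_traj):
        t = k * dt
        for ob in obstacles:
            # Predict the obstacle position at time t.
            ox = ob["x"] + ob["vx"] * t
            oy = ob["y"] + ob["vy"] * t
            # Safety ellipse semi-axes from the vehicle dimensions.
            a = ob["length"] / 2.0 + margin
            b = ob["width"] / 2.0 + margin
            if ((ex - ox) / a) ** 2 + ((ey - oy) / b) ** 2 <= 1.0:
                return k
    return None
```

A controller could call this on each optimized trajectory and fall back to a conservative maneuver when the result is not `None`. That fallback policy is likewise an assumption and is not described in the source.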
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines vision-language models with model predictive control
Uses bird's-eye view processing for traffic information
Integrates collision-avoidance fields into trajectory optimization
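The abstract mentions that the trajectory optimization runs inside an event-triggered MPC loop but does not state the trigger condition. The sketch below shows one hypothetical trigger, purely as an assumption: re-plan when the current plan is nearly exhausted, when a previously unseen vehicle appears, or when an observed vehicle has drifted from its predicted position by more than a tolerance. The name `should_replan` and the parameters `tol` and `min_steps` are illustrative.

```python
import math

def should_replan(predicted, observed, steps_left, tol=0.5, min_steps=5):
    """Decide whether to re-solve the trajectory optimization.

    predicted: dict vehicle_id -> (x, y) position assumed when planning.
    observed:  dict vehicle_id -> (x, y) position from the latest perception.
    The trigger rules are assumptions; the source does not specify them.
    """
    # Event 1: the remaining plan is too short to keep following.
    if steps_left < min_steps:
        return True
    for vid, (ox, oy) in observed.items():
        # Event 2: a vehicle that was not part of the last plan has appeared.
        if vid not in predicted:
            return True
        # Event 3: a known vehicle deviates from its predicted position.
        px, py = predicted[vid]
        if math.hypot(ox - px, oy - py) > tol:
            return True
    return False
```

Compared with re-solving at every time step, a trigger of this kind spends computation only when the traffic situation has changed enough to invalidate the current plan.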
Shanting Wang
Cornell University
Panagiotis Typaldos
School of Civil and Environmental Engineering, Cornell University, Ithaca, NY 14850, USA
Chenjun Li
School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14850, USA
Andreas A. Malikopoulos
Professor, Cornell University
Decentralized control, learning-based control, cyber-physical systems, emerging mobility systems