🤖 AI Summary
To address high communication bandwidth, poor interpretability, and neglect of perception/planning uncertainty in multi-vehicle cooperative autonomous driving, this paper proposes a vision-language model (VLM)-based uncertainty-aware cooperative planning framework. Methodologically, we design a two-stage natural language communication protocol: (1) the ego vehicle selects key interacting vehicles via mutual information maximization; (2) a VLM generates and parses lightweight textual messages that quantify perception and planning uncertainties, which are then combined through selective information fusion and uncertainty-aware planning. Our key contribution is the first explicit modeling and incorporation of perception uncertainty into natural language communication for cooperative driving. Experiments across diverse driving scenarios demonstrate that the framework reduces communication bandwidth by 63%, improves the driving safety score by 31%, decreases decision uncertainty by 61%, and quadruples the collision distance margin during near-miss events.
📝 Abstract
Safe large-scale coordination of multiple cooperative connected autonomous vehicles (CAVs) hinges on communication that is both efficient and interpretable. Existing approaches either rely on transmitting high-bandwidth raw sensor data streams or neglect perception and planning uncertainties inherent in shared data, resulting in systems that are neither scalable nor safe. To address these limitations, we propose Uncertainty-Guided Natural Language Cooperative Autonomous Planning (UNCAP), a vision-language model-based planning approach that enables CAVs to communicate via lightweight natural language messages while explicitly accounting for perception uncertainty in decision-making. UNCAP features a two-stage communication protocol: (i) an ego CAV first identifies the subset of vehicles most relevant for information exchange, and (ii) the selected CAVs then transmit messages that quantitatively express their perception uncertainty. By selectively fusing messages that maximize mutual information, this strategy allows the ego vehicle to integrate only the most relevant signals into its decision-making, improving both the scalability and reliability of cooperative planning. Experiments across diverse driving scenarios show a 63% reduction in communication bandwidth with a 31% increase in driving safety score, a 61% reduction in decision uncertainty, and a four-fold increase in collision distance margin during near-miss events. Project website: https://uncap-project.github.io/
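The selective-fusion idea in stage (ii) can be illustrated with a toy sketch. This is not the paper's implementation: it assumes each message carries a scalar Gaussian position estimate whose variance encodes perception uncertainty, so the mutual information contributed by a message is I = ½·log(prior_var / posterior_var), and fusion reduces to inverse-variance (Kalman-style) weighting. All function and field names (`select_and_fuse`, `info_gain`, `sender`, `mean`, `var`) are hypothetical.

```python
import math

# Hypothetical sketch, not UNCAP's actual method: messages report a
# Gaussian estimate (mean, var); lower variance = lower perception
# uncertainty. Under this model the ego can rank messages by the
# mutual information they would add and fuse only the top-k.

def posterior_var(prior_var: float, msg_var: float) -> float:
    """Variance after fusing one message (inverse-variance form)."""
    return 1.0 / (1.0 / prior_var + 1.0 / msg_var)

def info_gain(prior_var: float, msg_var: float) -> float:
    """Mutual information (in nats) gained from one Gaussian message."""
    return 0.5 * math.log(prior_var / posterior_var(prior_var, msg_var))

def select_and_fuse(ego_mean, ego_var, messages, k=2):
    """Pick the k messages with the highest information gain, then fuse
    them with the ego estimate by inverse-variance weighting."""
    ranked = sorted(messages, key=lambda m: info_gain(ego_var, m["var"]),
                    reverse=True)
    chosen = ranked[:k]
    weights = [1.0 / ego_var] + [1.0 / m["var"] for m in chosen]
    means = [ego_mean] + [m["mean"] for m in chosen]
    fused_var = 1.0 / sum(weights)
    fused_mean = fused_var * sum(w * mu for w, mu in zip(weights, means))
    return fused_mean, fused_var, [m["sender"] for m in chosen]
```

In this toy model, low-uncertainty senders dominate the fused estimate, and high-variance messages are simply never transmitted after selection, which is the intuition behind both the bandwidth savings and the uncertainty reduction reported above.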