Combating Partial Perception Deficit in Autonomous Driving with Multimodal LLM Commonsense

📅 2025-03-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Autonomous driving systems often perceive safety-critical objects only partially because of sensor limitations, which leads to unsafe or overly conservative control decisions. Method: This paper proposes LLM-RCO, the first framework to deeply integrate large language models (LLMs) into the closed-loop control pipeline, enabling coordinated hazard reasoning, short-horizon motion planning, action-conditioned verification, and safety constraint generation. The paper also introduces DriveLM-Deficit, the first fine-grained video dataset specifically designed for perception-deficit scenarios, and combines multimodal LLMs, video understanding, dynamic interactive reasoning, and CARLA simulation with an end-to-end action-conditioned verification mechanism. Contribution/Results: Evaluated under challenging conditions, LLM-RCO significantly improves traffic penetration rate and safety: emergency braking is reduced by 37.2%, while control policies become more proactive, regulation-compliant, and context-adaptive.

📝 Abstract

Partial perception deficits can compromise autonomous vehicle safety by disrupting environmental understanding. Current protocols typically respond with immediate stops or minimal-risk maneuvers, worsening traffic flow and lacking flexibility for rare driving scenarios. In this paper, we propose LLM-RCO, a framework leveraging large language models to integrate human-like driving commonsense into autonomous systems facing perception deficits. LLM-RCO features four key modules: hazard inference, short-term motion planner, action condition verifier, and safety constraint generator. These modules interact with the dynamic driving environment, enabling proactive and context-aware control actions to override the original control policy of autonomous agents. To improve safety in such challenging conditions, we construct DriveLM-Deficit, a dataset of 53,895 video clips featuring deficits of safety-critical objects, complete with annotations for LLM-based hazard inference and motion planning fine-tuning. Extensive experiments in adverse driving conditions with the CARLA simulator demonstrate that systems equipped with LLM-RCO significantly improve driving performance, highlighting its potential for enhancing autonomous driving resilience against adverse perception deficits. Our results also show that LLMs fine-tuned with DriveLM-Deficit can enable more proactive movements instead of conservative stops in the context of perception deficits.
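The four modules named in the abstract can be pictured as one override step around the agent's original policy. The sketch below is only an illustration under stated assumptions: every function name, data type, threshold, and the order in which the modules are called is hypothetical, and simple rules stand in for the LLM calls that the paper uses. It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    name: str            # hypothetical label, e.g. "cruise" or "creep_forward"
    target_speed: float  # metres per second


@dataclass
class Constraint:
    max_speed: float     # metres per second


def infer_hazards(missing_regions: List[str]) -> List[str]:
    """Stand-in for hazard inference (an LLM call in the paper)."""
    # Assumption: any region without perception may hide a hazard.
    return [f"possible hidden object in {r}" for r in missing_regions]


def plan_motion(hazards: List[str]) -> List[Action]:
    """Stand-in for the short-term motion planner."""
    # Assumption: prefer a slow proactive move, keep a stop as second choice.
    return [Action("creep_forward", 2.0), Action("stop", 0.0)]


def generate_constraint(hazards: List[str]) -> Constraint:
    """Stand-in for the safety constraint generator."""
    # Assumption: an arbitrary illustrative speed cap while hazards exist.
    return Constraint(max_speed=3.0)


def verify(action: Action, constraint: Constraint) -> bool:
    """Stand-in for the action condition verifier."""
    return action.target_speed <= constraint.max_speed


def rco_step(missing_regions: List[str], original: Action) -> Action:
    """Return the action to execute, overriding `original` only on a deficit."""
    hazards = infer_hazards(missing_regions)
    if not hazards:
        return original  # no perception deficit: keep the original policy
    constraint = generate_constraint(hazards)
    for candidate in plan_motion(hazards):
        if verify(candidate, constraint):
            return candidate
    return Action("stop", 0.0)  # conservative fallback if nothing verifies
```

With these placeholder rules, a call such as `rco_step(["front_left"], Action("cruise", 10.0))` returns the slow forward move rather than an immediate stop, which mirrors the proactive behaviour the abstract describes.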

Problem

Research questions and friction points this paper is trying to address.

Address the safety risk that partial perception deficits pose to autonomous driving

Enhance flexibility in rare driving scenarios with LLM commonsense

Enable proactive control actions, rather than conservative stops, using a multimodal LLM framework

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages large language models to override the original control policy when perception is deficient

Integrates human-like commonsense into vehicle systems

Fine-tunes LLMs on the DriveLM-Deficit dataset (53,895 video clips) for hazard inference and motion planning
