🤖 AI Summary
Traditional deterministic models struggle to capture environmental uncertainty and interactive causal relationships in autonomous driving scene understanding. To address this, we propose an uncertainty-aware interactive perception framework integrating LiDAR-based 3D object detection with multi-view RGB inputs. Our key contributions are threefold: (1) the first coupling of Bayesian Graph Neural Networks (BGNNs) with Chain-of-Thought (CoT) reasoning to enable risk-driven, interpretable interaction inference; (2) a predictive CoT mechanism that jointly infers future agent intentions and collision risks; and (3) Grad-CAM-based visualization coupled with dynamic spatiotemporal context modeling to enhance both interpretability and generalization. Evaluated on the DriveCoT benchmark, our method achieves state-of-the-art performance in interaction accuracy, uncertainty calibration, and decision interpretability.
📝 Abstract
Driving scene understanding is a critical real-world problem that involves interpreting and associating various elements of a driving environment, such as vehicles, pedestrians, and traffic signals. Despite advancements in autonomous driving, traditional pipelines rely on deterministic models that fail to capture the probabilistic nature and inherent uncertainty of real-world driving. To address this, we propose PRIMEDrive-CoT, a novel uncertainty-aware model for object interaction and Chain-of-Thought (CoT) reasoning in driving scenarios. In particular, our approach combines LiDAR-based 3D object detection with multi-view RGB references to ensure interpretable and reliable scene understanding. Uncertainty and risk assessment, along with object interactions, are modelled using Bayesian Graph Neural Networks (BGNNs) for probabilistic reasoning under ambiguous conditions. Interpretable decisions are facilitated through CoT reasoning, leveraging object dynamics and contextual cues, while Grad-CAM visualizations highlight attention regions. Extensive evaluations on the DriveCoT dataset demonstrate that PRIMEDrive-CoT outperforms state-of-the-art CoT and risk-aware models.
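As an illustration of what probabilistic reasoning over object interactions can look like, the following minimal sketch estimates a per-object risk distribution and its uncertainty on a small interaction graph. It is not the PRIMEDrive-CoT implementation: the abstract does not specify how the BGNN is built, so Monte Carlo dropout is swapped in here as one common approximation to Bayesian inference. The feature layout, weights, graph, and every function name are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, neighbors, weights, drop_p, rng):
    """One stochastic pass: mean-aggregate each node with its neighbors,
    then apply a linear layer whose weights are randomly dropped."""
    keep = 1.0 - drop_p  # assumes 0 <= drop_p < 1
    # One dropout mask per pass, shared by all nodes (inverted dropout scaling).
    mask = [[(1.0 / keep) if rng.random() < keep else 0.0 for _ in row]
            for row in weights]
    outputs = []
    for i, own in enumerate(features):
        group = [own] + [features[j] for j in neighbors[i]]
        agg = [sum(col) / len(group) for col in zip(*group)]
        logits = [sum(w * m * x for w, m, x in zip(w_row, m_row, agg))
                  for w_row, m_row in zip(weights, mask)]
        outputs.append(softmax(logits))
    return outputs


def mc_predict(features, neighbors, weights, drop_p=0.2, samples=50, seed=0):
    """Average class probabilities over several stochastic passes and report
    the predictive entropy of the averaged distribution for each node."""
    rng = random.Random(seed)
    n, k = len(features), len(weights)
    mean = [[0.0] * k for _ in range(n)]
    for _ in range(samples):
        passes = forward(features, neighbors, weights, drop_p, rng)
        for i, probs in enumerate(passes):
            for c, p in enumerate(probs):
                mean[i][c] += p / samples
    entropy = [-sum(p * math.log(p) for p in row if p > 0.0) for row in mean]
    return mean, entropy


# Hypothetical scene: three detected objects, two features each
# (for example normalized proximity and closing speed).
features = [[0.9, 0.8], [0.2, 0.1], [0.5, 0.5]]
neighbors = [[1], [0, 2], [1]]           # adjacency list of interactions
weights = [[-2.0, -2.0], [2.0, 2.0]]     # rows: low-risk and high-risk logits
mean_probs, entropy = mc_predict(features, neighbors, weights)
```

Here `mean_probs[i]` is the averaged risk distribution for object `i`, and `entropy[i]` grows as the stochastic passes disagree or the averaged prediction approaches uniform, which is the kind of signal a downstream reasoning step could use to flag ambiguous interactions.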