Domain-Enhanced Dual-Branch Model for Efficient and Interpretable Accident Anticipation

📅 2025-07-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the low accuracy, high computational overhead, and poor interpretability of real-time traffic accident prediction in autonomous driving, this paper proposes a domain-enhanced dual-branch multimodal fusion model. The model separately processes driving videos (using Long-CLIP) and structured accident texts (semantically parsed via GPT-4o prompt engineering), enabling efficient cross-modal collaborative modeling through cross-modal feature alignment and a lightweight fusion mechanism. Innovatively, it incorporates traffic-domain knowledge-constrained prompt templates and a hierarchical attention aggregation strategy, significantly reducing inference latency while preserving model interpretability. Evaluated on three major benchmarks—DAD, CCD, and A3D—the method achieves state-of-the-art performance with fewer parameters, improving average accuracy by 4.2% and F1-score by 5.7%. This work establishes a new paradigm for real-time, trustworthy accident early warning systems.
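The cross-modal alignment and lightweight fusion described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the per-frame visual features and the text feature have already been extracted (for example by an encoder such as Long-CLIP) and share one embedding dimension, and it swaps in a plain dot-product attention followed by concatenation as the fusion step. The names `fuse`, `accident_score`, and `temperature`, and all numeric values, are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale a vector to unit length (all-zero vectors are left unchanged)."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(frame_feats, text_feat, temperature=0.1):
    """Attend over frame features with the text feature as the query.

    Assumption: both modalities live in the same embedding space.
    Returns the fused vector (attended visual part + text part) and
    the per-frame attention weights, which can be inspected directly.
    """
    query = l2_normalize(text_feat)
    keys = [l2_normalize(f) for f in frame_feats]
    weights = softmax([dot(query, k) / temperature for k in keys])
    dim = len(query)
    visual = [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
    return visual + query, weights


def accident_score(fused, head_weights, bias=0.0):
    """Hypothetical linear head with a sigmoid, giving a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(dot(fused, head_weights) + bias)))


# Toy inputs: three frames and one text embedding, all 2-dimensional.
frames = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
text = [0.0, 1.0]
fused, weights = fuse(frames, text)
score = accident_score(fused, [0.5, 0.5, 0.5, 0.5])
```

In this toy setup the attention weights sum to 1 and the frame most aligned with the text embedding receives the largest weight, which is the sense in which such a fusion step stays inspectable.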

📝 Abstract

Developing a precise and computationally efficient traffic accident anticipation system is crucial for contemporary autonomous driving technologies, enabling timely intervention and loss prevention. In this paper, we propose an accident anticipation framework employing a dual-branch architecture that effectively integrates visual information from dashcam videos with structured textual data derived from accident reports. Furthermore, we introduce a feature aggregation method that facilitates seamless integration of multimodal inputs through large models (GPT-4o, Long-CLIP), complemented by targeted prompt engineering strategies to produce actionable feedback and standardized accident archives. Comprehensive evaluations conducted on benchmark datasets (DAD, CCD, and A3D) validate the superior predictive accuracy, enhanced responsiveness, reduced computational overhead, and improved interpretability of our approach, thus establishing a new benchmark for state-of-the-art performance in traffic accident anticipation.

Problem

Research questions and friction points this paper is trying to address.

Develop efficient accident anticipation for autonomous driving

Integrate visual and textual data for improved prediction

Enhance interpretability and reduce computational overhead

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-branch architecture integrates visual and textual data

Feature aggregation method uses GPT-4o and Long-CLIP

Prompt engineering enhances feedback and archives
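As a rough illustration of how prompt engineering could yield standardized accident archives, the sketch below builds a constrained prompt and validates a JSON reply. Everything in it is an assumption made for illustration: the field names in `REQUIRED_FIELDS`, the prompt wording, and the helpers `build_prompt` and `parse_reply` are hypothetical and are not taken from the paper. No model is called; the reply is a hand-written string standing in for model output.

```python
import json

# Hypothetical archive schema; the paper's actual fields are not given here.
REQUIRED_FIELDS = ("weather", "road_type", "agents", "cause")


def build_prompt(report_text):
    """Build a domain-constrained prompt asking for a fixed JSON schema."""
    fields = ", ".join(REQUIRED_FIELDS)
    return (
        "You are a traffic-safety analyst. Read the accident report below and "
        f"return a JSON object with exactly these keys: {fields}. "
        'Use the string "unknown" for anything the report does not state.\n\n'
        f"Report:\n{report_text}"
    )


def parse_reply(reply):
    """Parse a model reply and keep only the schema fields.

    Raises ValueError if any required field is missing, so malformed
    records never enter the archive.
    """
    record = json.loads(reply)
    missing = [key for key in REQUIRED_FIELDS if key not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {key: record[key] for key in REQUIRED_FIELDS}


# Stand-in for a model reply; a real system would obtain this from the LLM.
example_reply = json.dumps({
    "weather": "rain",
    "road_type": "intersection",
    "agents": ["car", "motorcycle"],
    "cause": "unknown",
    "extra": "ignored",
})
archive_entry = parse_reply(example_reply)
```

Fixing the schema in the prompt and rejecting incomplete replies is one simple way to keep archive entries uniform; the source does not say whether the authors validate replies this way.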

🔎 Similar Papers

Graph Neural Networks for Road Safety Modeling: Datasets and Evaluations for Accident Analysis