Large Language Models for Pedestrian Safety: An Application to Predicting Driver Yielding Behavior at Unsignalized Intersections

📅 2025-09-23
🤖 AI Summary
This study addresses the challenge of predicting driver yielding behavior at unsignalized intersections to enhance pedestrian safety. To overcome the difficulty of modeling interactive behaviors at a fine-grained level while keeping decisions interpretable, it proposes a multimodal prompting mechanism that integrates traffic-domain knowledge, structured reasoning chains, and few-shot prompting, improving large language models' (LLMs) understanding of dynamic traffic contexts. Using GPT-4o and Deepseek-V3, the framework fuses vision-language inputs with domain-constrained prompts to enable context-aware behavioral inference. Experimental results show GPT-4o achieves the highest accuracy (89.2%) and recall (85.7%), while Deepseek-V3 attains the best precision (91.4%). The study also highlights trade-offs between model performance and computational efficiency. The key contribution is a knowledge-augmented, interpretable LLM-based prediction framework designed for driver-pedestrian interaction at unsignalized intersections.

📝 Abstract
Pedestrian safety is a critical component of urban mobility and is strongly influenced by the interactions between pedestrian decision-making and driver yielding behavior at crosswalks. Modeling driver-pedestrian interactions at intersections requires accurately capturing the complexity of these behaviors. Traditional machine learning models often struggle to capture the nuanced and context-dependent reasoning required for these multifactorial interactions, due to their reliance on fixed feature representations and limited interpretability. In contrast, large language models (LLMs) are suited for extracting patterns from heterogeneous traffic data, enabling accurate modeling of driver-pedestrian interactions. Therefore, this paper leverages multimodal LLMs through a novel prompt design that incorporates domain-specific knowledge, structured reasoning, and few-shot prompting, enabling interpretable and context-aware inference of driver yielding behavior, as an example application of modeling pedestrian-driver interaction. We benchmarked state-of-the-art LLMs against traditional classifiers, finding that GPT-4o consistently achieves the highest accuracy and recall, while Deepseek-V3 excels in precision. These findings highlight the critical trade-offs between model performance and computational efficiency, offering practical guidance for deploying LLMs in real-world pedestrian safety systems.
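
The abstract compares models on accuracy, precision, and recall. As a reminder of how these differ for a binary yield / no-yield task, here is a minimal sketch. The function name `binary_metrics`, the label strings, and the toy data are assumptions made for illustration and do not come from the paper.

```python
def binary_metrics(y_true, y_pred, positive="yield"):
    """Compute accuracy, precision, and recall for a binary labeling task.

    `positive` names the class treated as the positive outcome
    (here, the driver yielding). Labels are hypothetical.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    total = tp + fp + fn + tn
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of the predicted yields, how many were real yields.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the real yields, how many were predicted.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy data, invented for illustration only.
y_true = ["yield", "yield", "no_yield", "no_yield", "yield"]
y_pred = ["yield", "no_yield", "no_yield", "yield", "yield"]
metrics = binary_metrics(y_true, y_pred)
```

A model can lead on precision while trailing on recall, which is why the abstract reports the metrics separately rather than as a single score.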
Problem

Research questions and friction points this paper is trying to address.

Modeling complex driver-pedestrian interactions at unsignalized intersections
Capturing nuanced context-dependent reasoning for multifactorial traffic behaviors
Overcoming traditional models' limitations in interpretability and feature representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal LLMs with domain-specific prompt design
Structured reasoning and few-shot prompting techniques
Benchmarking LLMs against traditional machine learning classifiers
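
To make the prompt-design idea more concrete, the following is a minimal sketch of how domain knowledge, structured reasoning steps, and few-shot examples might be assembled into one text prompt. It is not the authors' prompt: the names `Example`, `DOMAIN_NOTES`, `REASONING_STEPS`, and `build_prompt`, and all of the wording inside them, are hypothetical placeholders. The sketch also omits the image input and the actual model call.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One few-shot demonstration (hypothetical structure)."""
    scene: str      # text description of an observed interaction
    reasoning: str  # short worked reasoning for that scene
    label: str      # "yield" or "no_yield"


# Placeholder domain notes; the paper's actual knowledge is not shown here.
DOMAIN_NOTES = [
    "Placeholder note about vehicle speed and stopping distance.",
    "Placeholder note about pedestrian position at the crosswalk.",
]

# Placeholder reasoning chain; the step wording is an assumption.
REASONING_STEPS = [
    "Describe the pedestrian's position and movement.",
    "Describe the vehicle's approach.",
    "Weigh the factors and give a final answer.",
]


def build_prompt(scene: str, examples: list[Example]) -> str:
    """Join domain notes, reasoning steps, examples, and the query scene."""
    parts = ["Domain knowledge:"]
    parts += [f"- {note}" for note in DOMAIN_NOTES]
    parts.append("Reason step by step:")
    parts += [f"{i}. {step}" for i, step in enumerate(REASONING_STEPS, 1)]
    for ex in examples:
        parts.append(
            f"Example scene: {ex.scene}\n"
            f"Reasoning: {ex.reasoning}\n"
            f"Answer: {ex.label}"
        )
    parts.append(f"Scene: {scene}\nAnswer with 'yield' or 'no_yield'.")
    return "\n".join(parts)


demo = Example("A car slows near a marked crosswalk.", "The car is slowing.", "yield")
prompt = build_prompt("A van approaches a waiting pedestrian.", [demo])
```

Keeping the three ingredients as separate pieces makes it straightforward to drop one of them and compare results, which is one way a benchmark of prompt components could be organized.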
Yicheng Yang
Cornell University
Software Engineering, ML Fairness
Zixian Li
School of Artificial Intelligence, Hebei University of Technology
Jean Paul Bizimana
Department of Civil Engineering, Saint Louis University
Niaz Zafri
Department of Urban Studies and Planning, Massachusetts Institute of Technology
Yongfeng Dong
Hebei Province Key Laboratory of Big Data Computing, Hebei University of Technology
Tianyi Li
Department of Civil Engineering, Saint Louis University