🤖 AI Summary
Existing driving style recognition methods rely largely on low-level sensor-derived features and do not model the semantic reasoning that human experts apply, so algorithmic classifications can diverge from expert judgments.
Method: We propose an approach that uses natural-language behavioral descriptions, generated by the LLM-based module DriBehavGPT, as privileged semantic information during training, so that recognition outcomes align more closely with human-interpretable reasoning. The descriptions are converted into features through text embedding and dimensionality reduction, then supplied as privileged information to an SVM+ classifier. Because the privileged information is used only during training, inference still takes sensor data alone and incurs no additional overhead.
Contribution/Results: Evaluated on real-world driving data, the method improves F1 scores over conventional approaches by 7.6% in car-following and 7.9% in lane-changing scenarios. The results indicate that semantic behavioral representations can improve recognition accuracy while supporting interpretable, cognition-aligned driving style recognition.
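The training/inference asymmetry summarized above can be illustrated with a small sketch. This is not the authors' implementation: it trains a linear SVM+ style classifier by subgradient descent on a penalized form of the SVM+ primal, a simplification chosen so the example needs no QP solver. The function names, the hyperparameters (`C`, `gamma`, `penalty`, `lr`, `epochs`), and all toy feature values are assumptions. In particular, the two-dimensional privileged vectors only stand in for reduced text embeddings of LLM-generated behavior descriptions.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(p * q for p, q in zip(a, b))


def train_svm_plus(
    X: List[Vector],       # sensor features (available at train and test time)
    X_priv: List[Vector],  # privileged semantic features (train time only)
    y: List[int],          # labels in {-1, +1}
    C: float = 1.0,
    gamma: float = 1.0,
    penalty: float = 10.0,  # weight of the constraint-violation penalties (assumed)
    lr: float = 0.01,
    epochs: int = 300,
) -> Tuple[List[float], float]:
    """Full-batch subgradient descent on a penalized linear SVM+ primal.

    The slack of sample i is modeled in the privileged space as
    slack_i = <w_p, x_priv_i> + b_p. The two SVM+ constraints
    (slack_i >= 0 and margin_i >= 1 - slack_i) are enforced softly
    through hinge penalties instead of exactly.
    """
    d, d_priv = len(X[0]), len(X_priv[0])
    w, b = [0.0] * d, 0.0
    w_p, b_p = [0.0] * d_priv, 0.0
    for _ in range(epochs):
        g_w, g_b = list(w), 0.0                        # gradient of 0.5 * ||w||^2
        g_wp, g_bp = [gamma * v for v in w_p], 0.0     # gradient of gamma/2 * ||w_p||^2
        for x, xp, label in zip(X, X_priv, y):
            slack = dot(w_p, xp) + b_p
            margin = label * (dot(w, x) + b)
            coef = C                  # derivative of C * slack w.r.t. slack
            if slack < 0.0:           # violated: slack must be non-negative
                coef -= penalty
            if margin < 1.0 - slack:  # violated: margin must reach 1 - slack
                coef -= penalty
                for j in range(d):
                    g_w[j] -= penalty * label * x[j]
                g_b -= penalty * label
            for j in range(d_priv):
                g_wp[j] += coef * xp[j]
            g_bp += coef
        w = [wj - lr * gj for wj, gj in zip(w, g_w)]
        b -= lr * g_b
        w_p = [wj - lr * gj for wj, gj in zip(w_p, g_wp)]
        b_p -= lr * g_bp
    # Only the sensor-space parameters are returned; the privileged
    # parameters are discarded once training ends.
    return w, b


def predict(w: Vector, b: float, x: Vector) -> int:
    """Inference uses sensor features only."""
    return 1 if dot(w, x) + b >= 0.0 else -1


# Toy data (made up): two standardized sensor features per sample,
# label +1 = "aggressive", -1 = "calm".
X_train = [[2.0, 1.5], [1.8, 2.2], [2.5, 1.9],
           [-2.0, -1.5], [-1.7, -2.1], [-2.4, -1.8]]
y_train = [1, 1, 1, -1, -1, -1]
# Hypothetical stand-ins for reduced embeddings of behavior descriptions.
X_priv_train = [[0.9, 0.1], [0.8, 0.2], [0.9, 0.2],
                [0.1, 0.8], [0.2, 0.9], [0.1, 0.7]]

w, b = train_svm_plus(X_train, X_priv_train, y_train)
preds = [predict(w, b, x) for x in X_train]
```

Note that `predict` never receives a privileged vector: the semantic features shape the slack model during training and are then dropped, which is why inference cost matches that of a plain linear classifier on sensor features.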
📝 Abstract
Existing driving style recognition systems largely depend on low-level sensor-derived features for training, neglecting the rich semantic reasoning capability inherent to human experts. This discrepancy results in a fundamental misalignment between algorithmic classifications and expert judgments. To bridge this gap, we propose a novel framework that integrates Semantic Privileged Information (SPI) derived from large language models (LLMs) to align recognition outcomes with human-interpretable reasoning. First, we introduce DriBehavGPT, an interactive LLM-based module that generates natural-language descriptions of driving behaviors. These descriptions are then encoded into machine learning-compatible representations via text embedding and dimensionality reduction. Finally, we incorporate them as privileged information into Support Vector Machine Plus (SVM+) for training, enabling the model to approximate human-like interpretation patterns. Experiments across diverse real-world driving scenarios demonstrate that our SPI-enhanced framework outperforms conventional methods, achieving F1-score improvements of 7.6% (car-following) and 7.9% (lane-changing). Importantly, SPI is exclusively used during training, while inference relies solely on sensor data, ensuring computational efficiency without sacrificing performance. These results highlight the pivotal role of semantic behavioral representations in improving recognition accuracy while advancing interpretable, human-centric driving systems.
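For reference, the standard linear SVM+ formulation from the learning-using-privileged-information literature is sketched below. The abstract does not state which variant (linear or kernelized) the authors use, so the notation is an assumption: $x_i$ is the sensor feature vector, $x_i^*$ the privileged vector (here, the reduced text embedding of a behavior description), $y_i \in \{-1, +1\}$ the style label, and $C, \gamma > 0$ hyperparameters.

```latex
\begin{aligned}
\min_{w,\,b,\,w^*,\,b^*}\quad
  & \tfrac{1}{2}\lVert w\rVert^2
    + \tfrac{\gamma}{2}\lVert w^*\rVert^2
    + C\sum_{i=1}^{n}\bigl(\langle w^*, x_i^*\rangle + b^*\bigr) \\
\text{s.t.}\quad
  & y_i\bigl(\langle w, x_i\rangle + b\bigr)
    \;\ge\; 1 - \bigl(\langle w^*, x_i^*\rangle + b^*\bigr), \\
  & \langle w^*, x_i^*\rangle + b^* \;\ge\; 0,
    \qquad i = 1,\dots,n, \\[4pt]
\text{inference:}\quad
  & \hat{y}(x) = \operatorname{sign}\bigl(\langle w, x\rangle + b\bigr).
\end{aligned}
```

The slack of each training sample is modeled as a function of its privileged vector $x_i^*$ rather than as a free variable, which is how the semantic descriptions influence the learned boundary. The decision rule depends only on $w$ and $b$, consistent with the statement that inference relies solely on sensor data.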