Hybrid Approach for Driver Behavior Analysis with Machine Learning, Feature Optimization, and Explainable AI

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the difficulty of balancing predictive performance and interpretability in driver behavior analysis, where earlier machine learning and deep learning approaches often lacked feature optimization. The authors propose a hybrid framework that combines LIME (Local Interpretable Model-agnostic Explanations) with feature-based retraining. After a preprocessing pipeline of label encoding, random oversampling, and standardization, thirteen machine learning algorithms are evaluated. The Random Forest classifier reaches 95% accuracy on the full feature set and 94.2% after retraining on the ten features that LIME identifies as most influential on predictions. The reduced feature set thus costs less than one percentage point of accuracy while making the model's behavior easier to explain.
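The three preprocessing steps named in the summary can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code; the summary does not say which libraries were used, and in practice scikit-learn's `LabelEncoder` and `StandardScaler` together with imbalanced-learn's `RandomOverSampler` would be a common choice. The class names and feature values below are invented toy data.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def label_encode(values):
    """Map each distinct category to an integer code (codes follow sorted order)."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def standardize(rows):
    """Scale each column to zero mean and unit variance (population std)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


# Toy data, invented for illustration only (not from the paper's dataset).
raw_labels = ["NORMAL", "NORMAL", "NORMAL", "AGGRESSIVE"]
rows = [[0.1, 30.0], [0.2, 35.0], [0.1, 32.0], [0.9, 80.0]]

y, classes = label_encode(raw_labels)
X_bal, y_bal = random_oversample(rows, y)
X_scaled = standardize(X_bal)
```

When a held-out test set is used, oversampling and scaling statistics are usually fitted on the training split only, so that duplicated rows and test-set statistics do not leak into evaluation. The summary does not state how the authors handled this.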

📝 Abstract

Progressive driver behavior analytics is crucial for improving road safety and mitigating the problems caused by aggressive or inattentive driving. Previous studies have employed machine learning and deep learning techniques, but often with little feature optimization, which compromises both performance and interpretability. To fill these gaps, this paper proposes a hybrid approach to driver behavior analysis using a Kaggle dataset of 12,857 rows and 18 columns. After preprocessing with label encoding, random oversampling, and standard scaling, 13 machine learning algorithms were tested, and the Random Forest Classifier achieved an accuracy of 95%. LIME, an explainable AI (XAI) technique, was then applied to identify the top 10 features with the most significant positive and negative influence on predictions, and the same algorithms were retrained on those features. The accuracy of the Random Forest Classifier decreased slightly to 94.2%, indicating that the model can be made more efficient with only a minor loss in performance. The hybrid model thus offers both predictive power and explainability for driver behavior analysis.
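The feature-selection step described in the abstract can be illustrated with a short sketch. The abstract does not say how per-instance LIME explanations were combined into a single top-10 ranking, so the code below assumes one common choice: rank features by their mean absolute local weight, which counts both positive and negative influence. The function names, feature names, and weights are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def top_k_features(local_explanations, k=10):
    """Rank features by mean absolute local weight and return the top k names.

    `local_explanations` is a list of dicts, one per explained instance,
    mapping feature name -> signed local weight (as a LIME-style explainer
    would report). Aggregation by mean absolute weight is an assumption.
    """
    totals = defaultdict(float)
    for explanation in local_explanations:
        for feature, weight in explanation.items():
            totals[feature] += abs(weight)
    n = len(local_explanations)
    ranked = sorted(totals, key=lambda f: totals[f] / n, reverse=True)
    return ranked[:k]


def select_columns(rows, feature_names, keep):
    """Keep only the columns named in `keep`, in that order."""
    idx = [feature_names.index(f) for f in keep]
    return [[row[i] for i in idx] for row in rows]


# Hypothetical local weights for two explained instances (not real results).
explanations = [
    {"speed": 0.40, "brake": -0.30, "hour": 0.02},
    {"speed": 0.35, "brake": -0.25, "hour": -0.01},
]

keep = top_k_features(explanations, k=2)
reduced = select_columns([[50.0, 1.0, 13.0]], ["speed", "brake", "hour"], keep)
```

The reduced matrix would then be passed to the same training routine as before, which is what allows the 95% and 94.2% accuracies to be compared directly.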

Problem

Research questions and friction points this paper is trying to address.

driver behavior analysis

feature optimization

machine learning

explainable AI

interpretability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Explainable AI

Feature Optimization

Driver Behavior Analysis