Hybrid Approach for Driver Behavior Analysis with Machine Learning, Feature Optimization, and Explainable AI

📅 2026-01-07
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the challenge of balancing model performance and interpretability in traditional driving behavior analysis, which often suffers from suboptimal feature engineering. To overcome this limitation, the authors propose a modeling framework that integrates LIME (Local Interpretable Model-agnostic Explanations) with a feature-selection-and-retraining mechanism. Building on a systematic preprocessing pipeline—label encoding, random oversampling, and standardization—the framework evaluates thirteen machine learning algorithms. Experimental results show that the selected Random Forest model achieves an initial accuracy of 95%, which remains high at 94.2% after retraining on the LIME-selected features. The approach identifies the ten driving features most influential on predictions, substantially enhancing model transparency and practical utility.
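The preprocessing pipeline described above (label encoding, random oversampling, standardization, then Random Forest training) can be sketched as follows. This is a minimal illustration on synthetic data: the column layout, class names, and sampling code are assumptions, not the paper's actual dataset or implementation (the authors likely used a library oversampler such as imblearn's RandomOverSampler; here the duplication is done by hand with NumPy).

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy stand-in for the Kaggle driving-behavior data (12,857 rows and
# 18 columns in the paper); features and class names here are invented.
n = 300
X = rng.normal(size=(n, 5))  # numeric sensor-style features
labels = rng.choice(["SLOW", "NORMAL", "AGGRESSIVE"], size=n, p=[0.1, 0.6, 0.3])

# 1) Label-encode the target classes into integers.
le = LabelEncoder()
y = le.fit_transform(labels)

# 2) Random oversampling: duplicate rows of each class until every class
#    matches the majority-class count, balancing the training set.
counts = np.bincount(y)
target = counts.max()
idx = np.concatenate([
    rng.choice(np.flatnonzero(y == c), size=target, replace=True)
    for c in range(len(counts))
])
X_bal, y_bal = X[idx], y[idx]

# 3) Standard scaling: zero mean, unit variance per feature.
X_bal = StandardScaler().fit_transform(X_bal)

# 4) One of the 13 evaluated models -- the Random Forest that scored best.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_bal, y_bal)
print(np.bincount(y_bal))  # balanced class counts after oversampling
```

In a real run the same fit/transform objects would be fit on the training split only and reused on the test split, to avoid leaking test statistics into the scaler.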

📝 Abstract
Progressive driver behavior analytics is crucial for improving road safety and mitigating the harm caused by aggressive or inattentive driving. Previous studies have employed machine learning and deep learning techniques, but these often suffer from poor feature optimization, compromising both performance and interpretability. To address these gaps, this paper proposes a hybrid approach to driver behavior analysis using a 12,857-row, 18-column dataset taken from Kaggle. After applying preprocessing techniques such as label encoding, random oversampling, and standard scaling, 13 machine learning algorithms were tested, and the Random Forest Classifier achieved an accuracy of 95%. The LIME technique from XAI was then applied to identify the 10 features with the most significant positive and negative influence on predictions, and the same algorithms were retrained on this reduced feature set. The accuracy of the Random Forest Classifier decreased only slightly, to 94.2%, confirming that the model can be made more efficient without sacrificing performance. This hybrid model thus delivers both predictive power and explainability for the driver behavior analysis process.
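The abstract's core loop (train on all features, rank them, keep the top 10, retrain, compare accuracy) can be sketched as below. Note an explicit substitution: the paper ranks features with LIME explanations, which requires the `lime` package; this sketch approximates the ranking with the forest's built-in impurity importances so it stays self-contained. The synthetic 18-feature dataset only echoes the paper's dimensionality.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic 18-feature classification problem (stand-in for the real data).
X, y = make_classification(n_samples=1000, n_features=18, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Baseline: Random Forest on all 18 features.
full = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
acc_full = full.score(X_te, y_te)

# Rank features and keep the 10 most influential, then retrain
# (the paper derives this ranking from LIME explanations instead).
top10 = np.argsort(full.feature_importances_)[::-1][:10]
reduced = RandomForestClassifier(random_state=0).fit(X_tr[:, top10], y_tr)
acc_reduced = reduced.score(X_te[:, top10], y_te)

print(f"all 18 features: {acc_full:.3f}   top-10 only: {acc_reduced:.3f}")
```

The paper reports the same qualitative outcome: a small accuracy drop (95% to 94.2%) in exchange for a leaner, more interpretable feature set.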
Problem

Research questions and friction points this paper is trying to address.

driver behavior analysis
feature optimization
machine learning
explainable AI
interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Explainable AI
Feature Optimization
Driver Behavior Analysis
Random Forest
LIME
Mehedi Hasan Shuvo
Department of CSE, DUET, Gazipur, Bangladesh
Md. Raihan Tapader
Department of CSE, DUET, Gazipur, Bangladesh
Nur Mohammad Tamjid
Department of CSE, DUET, Gazipur, Bangladesh
Sajjadul Islam
Department of CSE, DUET, Gazipur, Bangladesh
Ahnaf Atef Choudhury
PhD in Information Technology, George Mason University
Data Science · Applied Machine Learning · Image Processing · Medical Informatics · Natural Language Processing
Jia Uddin
Woosong University
Fault Diagnosis using ML, DL and TL · Multimedia Signal Processing