Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning

📅 2024-11-20
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the accuracy drop that distracted-driver classifiers suffer when the camera position inside the vehicle changes, this paper proposes DBMNet (Driver Behavior Monitoring Network), built on a lightweight backbone. DBMNet integrates a disentanglement module that discards camera-view information from the learned features, combined with contrastive learning that strengthens the encoding of different driver actions, yielding robustness across cameras and day/night conditions. On the daytime and nighttime subsets of the 100-Driver dataset, DBMNet improves Top-1 accuracy by an average of 9% over the state of the art. Cross-dataset and cross-camera experiments on three benchmarks—AUCDD-V1, EZZ2021, and SFD—further demonstrate its strong generalization capability.

📝 Abstract
The classification of distracted drivers is pivotal for ensuring safe driving. Previous studies demonstrated the effectiveness of neural networks in automatically predicting driver distraction, fatigue, and potential hazards. However, recent research has uncovered a significant loss of accuracy in these models when applied to samples acquired under conditions that differ from the training data. In this paper, we introduce a robust model designed to withstand changes in camera position within the vehicle. Our Driver Behavior Monitoring Network (DBMNet) relies on a lightweight backbone and integrates a disentanglement module to discard camera view information from features, coupled with contrastive learning to enhance the encoding of various driver actions. Experiments conducted on the daytime and nighttime subsets of the 100-Driver dataset validate the effectiveness of our approach with an average increase of 9% in Top-1 accuracy in comparison with the state of the art. In addition, cross-dataset and cross-camera experiments conducted on three benchmark datasets, namely AUCDD-V1, EZZ2021 and SFD, demonstrate the superior generalization capability of the proposed method.
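The abstract describes a disentanglement module that discards camera-view information from the features. One common way to realize this idea—a minimal sketch, not the paper's stated implementation—is to split the embedding into an action branch and a view branch and penalize statistical dependence between them, here via the Frobenius norm of their cross-covariance:

```python
import numpy as np

def decorrelation_penalty(action_feats, view_feats):
    """Frobenius norm of the cross-covariance between the action and
    view feature branches. Driving this to zero discourages camera-view
    information from leaking into the action representation.
    Generic illustration; the paper's module may differ."""
    # Center each branch so we measure covariance, not raw correlation
    a = action_feats - action_feats.mean(axis=0)
    v = view_feats - view_feats.mean(axis=0)
    cross_cov = (a.T @ v) / len(a)          # shape: (d_action, d_view)
    return float(np.sum(cross_cov ** 2))
```

A penalty like this would be added to the classification loss during training, so that the action branch alone is used for predicting the distraction class.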
Problem

Research questions and friction points this paper is trying to address.

Improving accuracy in cross-camera distracted driver classification
Enhancing model generalization across varying camera positions
Reducing computational costs while maintaining high performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight backbone for efficient processing
Feature disentanglement to remove camera view bias
Contrastive learning to enhance driver action encoding
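The contrastive component pulls together features of the same driver action and pushes apart features of different actions. The page does not give the exact loss used by DBMNet; the sketch below shows a generic supervised contrastive (NT-Xent-style) loss, with the temperature value chosen purely for illustration:

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss: for each anchor, positives
    are other samples with the same action label; all remaining samples
    act as negatives. Illustrative, not the paper's exact formulation."""
    # L2-normalize embeddings so dot products are cosine similarities
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = (f @ f.T) / temperature
    n = len(labels)
    total = 0.0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Denominator sums over every other sample (positives and negatives)
        denom = sum(np.exp(sim[i, j]) for j in range(n) if j != i)
        total += -np.mean([np.log(np.exp(sim[i, p]) / denom) for p in positives])
    return total / n
```

Minimizing such a loss clusters same-action embeddings regardless of which camera captured them, which is how contrastive learning complements the view-discarding disentanglement step.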