YawDD+: Frame-level Annotations for Accurate Yawn Prediction

📅 2025-12-12

📈 Citations: 0

✨ Influential: 0

career value

226K/year

🤖 AI Summary

Frequent road accidents caused by driver fatigue are exacerbated by coarse temporal granularity in existing video-level annotated datasets—where only whole video clips are labeled as containing yawning—introducing significant temporal noise and severely limiting yawning recognition accuracy. To address this, we introduce YawDD+, the first frame-level fine-grained yawning dataset, built via a human-in-the-loop semi-automatic annotation pipeline enabling precise start/end-frame labeling. Leveraging this high-quality data, we design a lightweight MNasNet-based classifier and a YOLOv11-based detector, both optimized for deployment on the NVIDIA Jetson Nano edge platform. Experiments demonstrate state-of-the-art performance: 99.34% frame-level classification accuracy and 95.69% mAP for yawning detection—improving upon video-level supervision by +6.0% accuracy and +5.0% mAP—while achieving real-time inference at 59.8 FPS. These results empirically validate that enhanced annotation granularity is decisive for robust, low-latency driver fatigue monitoring.

Technology Category

Application Category

📝 Abstract

Driver fatigue remains a leading cause of road accidents, with 24% of crashes involving drowsy drivers. While yawning serves as an early behavioral indicator of fatigue, existing machine learning approaches face significant challenges due to video-annotated datasets that introduce systematic noise from coarse temporal annotations. We develop a semi-automated labeling pipeline with human-in-the-loop verification, which we apply to YawDD, enabling more accurate model training. Training the established MNasNet classifier and YOLOv11 detector architectures on YawDD+ improves frame accuracy by up to 6% and mAP by 5% over video-level supervision, achieving 99.34% classification accuracy and 95.69% detection mAP. The resulting approach deliver up to 59.8 FPS on edge AI hardware (NVIDIA Jetson Nano), confirming that enhanced data quality alone supports on-device yawning monitoring without server-side computation.

Problem

Research questions and friction points this paper is trying to address.

Improves yawn detection accuracy via frame-level annotations

Reduces systematic noise from coarse video annotations

Enables real-time on-device monitoring for driver fatigue

Innovation

Methods, ideas, or system contributions that make the work stand out.

Semi-automated labeling pipeline with human verification

Training MNasNet and YOLOv11 on improved YawDD+ dataset

Achieving high accuracy and FPS on edge AI hardware

🔎 Similar Papers

No similar papers found.