Adaptive Vehicle Speed Classification via BMCNN with Reinforcement Learning-Enhanced Acoustic Processing

📅 2025-08-31
🤖 AI Summary
To address the trade-off between classification accuracy and inference efficiency in real-time acoustic traffic speed monitoring, this paper proposes an adaptive audio frame selection framework that integrates deep learning and reinforcement learning. A dual-branch multi-scale convolutional neural network (BMCNN) extracts MFCC and wavelet time-frequency features in parallel, and an attention-enhanced deep Q-network (DQN) performs dynamic frame sampling and early decision-making. The key contribution is embedding reinforcement learning directly into the classification pipeline, which enables end-to-end optimization of frame selection and thereby achieves high accuracy without sacrificing efficiency. Evaluations on the IDMT-Traffic and SZUR-Acoustic datasets yield classification accuracies of 95.99% and 92.3%, respectively, with up to a 1.63× speedup in average processing time. The approach is reported to outperform baseline methods including A3C, Dueling Double DQN (DDDQN), Self-Attention Actor-Critic (SA2C), PPO, and TD3.

๐Ÿ“ Abstract
Traffic congestion remains a pressing urban challenge, requiring intelligent transportation systems for real-time management. We present a hybrid framework that combines deep learning and reinforcement learning for acoustic vehicle speed classification. A dual-branch BMCNN processes MFCC and wavelet features to capture complementary frequency patterns. An attention-enhanced DQN adaptively selects the minimal number of audio frames and triggers early decisions once confidence thresholds are reached. Evaluations on IDMT-Traffic and our SZUR-Acoustic (Suzhou) datasets show 95.99% and 92.3% accuracy, with up to 1.63x faster average processing via early termination. Compared with A3C, DDDQN, SA2C, PPO, and TD3, the method provides a superior accuracy-efficiency trade-off and is suitable for real-time ITS deployment in heterogeneous urban environments.
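
The early-decision idea in the abstract can be illustrated with a small sketch. It is not the paper's method: the paper uses an attention-enhanced DQN to choose frames and decide when to stop, whereas the code below swaps in a fixed confidence threshold on the running mean of per-frame class probabilities. The function name `classify_with_early_exit`, the `threshold` and `min_frames` parameters, and the example probabilities are all hypothetical.

```python
from typing import Sequence, Tuple


def classify_with_early_exit(
    frame_probs: Sequence[Sequence[float]],
    threshold: float = 0.9,
    min_frames: int = 1,
) -> Tuple[int, int]:
    """Return (predicted_class, frames_used).

    frame_probs[i] holds the class probabilities a classifier produced for
    audio frame i. Frames are consumed in order; processing stops as soon as
    the running mean probability of the leading class reaches `threshold`.
    This fixed rule is a stand-in for a learned stopping policy.
    """
    if not frame_probs:
        raise ValueError("need at least one frame")

    num_classes = len(frame_probs[0])
    running_sum = [0.0] * num_classes
    best = 0

    for used, probs in enumerate(frame_probs, start=1):
        # Accumulate evidence from the current frame.
        for c, p in enumerate(probs):
            running_sum[c] += p
        mean = [s / used for s in running_sum]
        best = max(range(num_classes), key=mean.__getitem__)

        # Early decision: confident enough, so skip the remaining frames.
        if used >= min_frames and mean[best] >= threshold:
            return best, used

    # Threshold never reached: fall back to using every frame.
    return best, len(frame_probs)


# Hypothetical per-frame probabilities for a two-class problem.
frames = [[0.6, 0.4], [0.95, 0.05], [0.99, 0.01], [0.5, 0.5]]
label, used = classify_with_early_exit(frames, threshold=0.75)
speedup = len(frames) / used  # frames available / frames actually processed
```

With these made-up inputs the rule stops after two of the four frames, so the frame-count speedup is 2.0. A learned policy such as the DQN described in the abstract would replace the fixed threshold with a state-dependent choice between continuing and stopping.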

Problem

Research questions and friction points this paper is trying to address.

Classifying vehicle speed using acoustic data in urban traffic
Reducing computational latency for real-time traffic management systems
Improving accuracy-efficiency trade-off in intelligent transportation deployment

Innovation

Methods, ideas, or system contributions that make the work stand out.

BMCNN processes MFCC and wavelet features
Attention-enhanced DQN selects audio frames
Early termination achieves faster processing
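
To make the wavelet branch listed above more concrete, here is a minimal sketch of how multi-scale wavelet features could be computed from one audio frame. The source does not say which wavelet family or decomposition depth the BMCNN uses, so the Haar wavelet, the `levels` parameter, and the use of per-subband energy as a feature are assumptions chosen for simplicity. The helper names `haar_step` and `subband_energies` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail), each half the input length.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def subband_energies(signal: Sequence[float], levels: int) -> List[float]:
    """Energy of each detail band (finest first), then the final approximation.

    Assumes len(signal) is divisible by 2**levels.
    """
    if levels < 1 or len(signal) % (2 ** levels):
        raise ValueError("length must be divisible by 2**levels, levels >= 1")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


# A rapidly alternating frame puts its energy in the finest detail band;
# a constant frame puts it all in the final approximation band.
fast = subband_energies([1.0, -1.0] * 4, levels=2)
flat = subband_energies([1.0] * 8, levels=2)
```

Because the Haar transform used here is orthonormal, the band energies sum to the energy of the input frame, which makes them easy to sanity-check. In practice a library such as PyWavelets would likely be used instead of hand-written filters.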