🤖 AI Summary
To address real-time speech enhancement on resource-constrained edge devices, this paper proposes a lightweight U-Net architecture integrated with a reverse soft attention mechanism. The design keeps the parameter count small while strengthening the modeling of salient speech features, and it enables low-latency, end-to-end inference with GPU-friendly computation. Evaluated on standard benchmarks, the proposed method achieves a 0.64 PESQ improvement and a 6.24% reduction in word error rate (WER) over un-enhanced speech, while outperforming similarly sized baseline models, with corresponding gains in speech intelligibility and subjective quality. The core contribution lies in embedding reverse attention into the encoder-decoder pathways of the lightweight U-Net, balancing model compactness and enhancement performance. This yields an efficient, practical solution for real-time speech enhancement at the edge.
📝 Abstract
This paper introduces a lightweight deep learning model for real-time speech enhancement, designed to operate efficiently on resource-constrained devices. The proposed model leverages a compact architecture that facilitates rapid inference without compromising performance. Key contributions include incorporating soft attention gates into the U-Net architecture, which is known to perform well for segmentation tasks, and optimizing the design for GPU execution. Experimental evaluations demonstrate that the model achieves competitive speech quality and intelligibility metrics, such as PESQ and word error rate (WER), improving on similarly sized baseline models. We achieve a 6.24% WER improvement and a 0.64 PESQ score improvement over un-enhanced waveforms.
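The abstract does not spell out the exact gating formulation, so as a rough illustration only: a soft attention gate in a U-Net typically re-weights encoder skip-connection features with a mask computed from a decoder-side gating signal (in the style of additive attention gates). All variable names, shapes, and the additive-fusion form below are assumptions for the sketch, not the paper's actual architecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_attention_gate(x, g, Wx, Wg, psi):
    """Illustrative additive soft attention gate for a U-Net skip connection.

    x        : encoder skip features, shape (C, T)  -- assumed layout
    g        : decoder gating features, shape (C, T)
    Wx, Wg   : (C_int, C) projection matrices (hypothetical parameters)
    psi      : (1, C_int) projection producing one mask value per frame
    """
    # Fuse the two feature maps in a shared intermediate space.
    q = np.maximum(Wx @ x + Wg @ g, 0.0)   # ReLU over the additive fusion
    alpha = sigmoid(psi @ q)               # (1, T) soft mask in (0, 1)
    return x * alpha                       # attenuate or pass skip features

# Toy usage with random features: C=4 channels, T=8 frames, C_int=3.
rng = np.random.default_rng(0)
C, T, C_int = 4, 8, 3
x = rng.standard_normal((C, T))
g = rng.standard_normal((C, T))
Wx = rng.standard_normal((C_int, C))
Wg = rng.standard_normal((C_int, C))
psi = rng.standard_normal((1, C_int))
out = soft_attention_gate(x, g, Wx, Wg, psi)
assert out.shape == x.shape
```

Because the mask lies in (0, 1), the gate can only scale skip features down, which is one way such gates suppress noise-dominated regions before the decoder merges the skip connection.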