🤖 AI Summary
This work addresses the challenge of achieving real-time video denoising under realistic camera noise, which typically comprises multiple complex sensor noise components. To this end, the authors propose PocketDVDNet, a lightweight network that integrates sparsity-guided structured pruning, physics-informed noise modeling, and knowledge distillation with implicit noise handling—eliminating the need for explicit noise-map inputs. This approach enables substantial model compression while enhancing denoising performance. Experimental results demonstrate that PocketDVDNet reduces model size by 74% compared to the original architecture, achieves real-time processing on five-frame image patches, and improves denoising quality. The method effectively balances computational efficiency and reconstruction accuracy, making it well suited for latency-sensitive applications such as autofocus systems, autonomous driving, and real-time video surveillance.
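To make the pruning step concrete, the sketch below illustrates the general idea behind sparsity-guided structured pruning: an L1 penalty during training drives per-channel scale factors (e.g. BatchNorm gammas) toward zero, and whole channels with small scales are then removed. The function name, the use of BatchNorm scales, and the keep ratio are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (assumption, not PocketDVDNet's code): select whole
# channels to keep by the magnitude of their learned scaling factors.
import numpy as np

def select_channels(scales, keep_ratio):
    """Keep the `keep_ratio` fraction of channels with the largest |scale|.

    scales: 1-D array of per-channel scaling factors (e.g. BatchNorm gammas)
            trained under an L1 sparsity penalty.
    Returns a boolean mask over channels; False-channels are pruned away.
    """
    n_keep = max(1, int(round(keep_ratio * scales.size)))
    threshold = np.sort(np.abs(scales))[-n_keep]  # n_keep-th largest magnitude
    return np.abs(scales) >= threshold

# Hypothetical layer with 64 channels: sparsity training has pushed most
# gammas close to zero, leaving a minority of clearly important channels.
rng = np.random.default_rng(0)
gammas = np.where(rng.random(64) < 0.7,
                  rng.normal(0.0, 1e-3, 64),   # "dead" channels
                  rng.normal(0.0, 1.0, 64))    # surviving channels
mask = select_channels(gammas, keep_ratio=0.26)  # ~74% reduction, per the paper
print(mask.sum(), "of", mask.size, "channels kept")
```

In practice the retained channels would be copied into a smaller network, which is then retrained (here, distilled from the teacher) to recover accuracy.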
📝 Abstract
Live video denoising under realistic, multi-component sensor noise remains challenging for applications such as autofocus, autonomous driving, and surveillance. We propose PocketDVDNet, a lightweight video denoiser developed using our model compression framework that combines sparsity-guided structured pruning, a physics-informed noise model, and knowledge distillation to achieve high-quality restoration with reduced resource demands. Starting from a reference model, we induce sparsity, apply targeted channel pruning, and retrain a teacher on realistic multi-component noise. The student network learns implicit noise handling, eliminating the need for explicit noise-map inputs. PocketDVDNet reduces the original model size by 74% while improving denoising quality and processing five-frame patches in real time. These results demonstrate that aggressive compression, combined with domain-adapted distillation, can reconcile performance and efficiency for practical, real-time video denoising.
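The distillation step described above can be sketched as a loss that blends supervision from clean ground truth with a term that mimics the teacher's output; because the student sees only the noisy frames, the teacher's noise handling is transferred implicitly. Everything here (the stand-in denoisers, the weighting `alpha`, the toy five-frame patch) is an assumption for illustration, not the paper's training setup.

```python
# Minimal sketch (assumptions, not the paper's implementation) of a
# distillation objective for a student that receives no explicit noise map.
import numpy as np

def distillation_loss(student_out, teacher_out, clean, alpha=0.5):
    """Blend supervised reconstruction with a teacher-mimicking term."""
    sup = np.mean((student_out - clean) ** 2)         # match ground truth
    dist = np.mean((student_out - teacher_out) ** 2)  # match teacher output
    return alpha * sup + (1.0 - alpha) * dist

# Toy 5-frame patch; the "networks" are crude stand-ins for real denoisers.
rng = np.random.default_rng(1)
clean = rng.random((5, 8, 8))
noisy = clean + rng.normal(0.0, 0.1, clean.shape)
teacher_out = noisy.mean(axis=0, keepdims=True).repeat(5, axis=0)  # temporal average as teacher proxy
student_out = noisy  # untrained student: identity mapping
loss = distillation_loss(student_out, teacher_out, clean)
print("toy distillation loss:", float(loss))
```

A real training loop would backpropagate this loss through the pruned student network over batches of noisy/clean video pairs.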