🤖 AI Summary
This work addresses key limitations of quantization-aware training and knowledge distillation in image restoration, including teacher–student capacity mismatch, spatial error amplification caused by decoder-side distillation, and conflict between reconstruction and distillation losses due to quantization noise. To overcome these challenges, the authors propose the QDR framework, which eliminates capacity gaps via FP32 self-distillation, introduces decoder-free distillation (DFD) at the bottleneck to correct quantization errors, and employs learnable magnitude reweighting (LMR) to dynamically balance gradient conflicts. Additionally, a learnable degradation gating (LDG) module is incorporated to enhance robustness. The proposed method recovers 96.5% of FP32 performance under INT8 quantization, achieves 442 FPS on an NVIDIA Jetson Orin, and improves downstream object detection by 16.3 mAP.
📝 Abstract
Quantization-Aware Training (QAT), combined with Knowledge Distillation (KD), holds immense promise for compressing models for edge deployment. However, joint optimization for precision-sensitive image restoration (IR), which recovers visual quality from degraded images, remains largely underexplored. Directly adapting QAT-KD to low-level vision reveals three critical bottlenecks: teacher-student capacity mismatch, spatial error amplification during decoder distillation, and an optimization "tug-of-war" between reconstruction and distillation losses caused by quantization noise. To tackle these, we introduce Quantization-aware Distilled Restoration (QDR), a framework for edge-deployed IR. QDR eliminates capacity mismatch via FP32 self-distillation and prevents error amplification through Decoder-Free Distillation (DFD), which corrects quantization errors strictly at the network bottleneck. To stabilize the optimization tug-of-war, we propose Learnable Magnitude Reweighting (LMR), which dynamically balances competing gradients. Finally, we design an Edge-Friendly Model (EFM) featuring a lightweight Learnable Degradation Gating (LDG) to dynamically modulate spatial degradation localization. Extensive experiments across four IR tasks demonstrate that our Int8 model recovers 96.5% of FP32 performance, achieves 442 frames per second (FPS) on an NVIDIA Jetson Orin, and boosts downstream object detection by 16.3 mAP.
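To make the "tug-of-war" idea concrete, the sketch below illustrates one plausible form of gradient-magnitude reweighting: each loss term is weighted in inverse proportion to its gradient magnitude, so neither the reconstruction nor the distillation objective dominates the update. This is a minimal illustration only; the paper's actual LMR module learns its weights during training, and the function names and cross-weighting scheme here are assumptions, not the authors' implementation.

```python
# Hedged sketch of magnitude-based loss reweighting (NOT the paper's exact
# LMR formulation). Each objective's weight is proportional to the OTHER
# objective's gradient magnitude, so the weighted gradients are balanced.

def balance_weights(grad_mag_rec: float, grad_mag_dist: float,
                    eps: float = 1e-8) -> tuple[float, float]:
    """Return (w_rec, w_dist) that equalize the two gradient magnitudes.

    Cross-weighting: the term with the smaller gradient gets the larger
    weight, preventing one loss from dominating the shared parameters.
    """
    total = grad_mag_rec + grad_mag_dist + eps
    w_rec = grad_mag_dist / total
    w_dist = grad_mag_rec / total
    return w_rec, w_dist


def combined_loss(loss_rec: float, loss_dist: float,
                  w_rec: float, w_dist: float) -> float:
    """Weighted sum of reconstruction and distillation losses."""
    return w_rec * loss_rec + w_dist * loss_dist


# Example: reconstruction gradients are 4x larger than distillation
# gradients, so reconstruction is down-weighted and distillation boosted.
w_rec, w_dist = balance_weights(grad_mag_rec=4.0, grad_mag_dist=1.0)
print(f"w_rec={w_rec:.2f}, w_dist={w_dist:.2f}")
```

After reweighting, the effective gradient contributions become `w_rec * 4.0` and `w_dist * 1.0`, which are equal (both 0.8 in this example), which is the balancing behavior LMR is described as learning dynamically.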