🤖 AI Summary
Existing post-training quantization methods for super-resolution models typically decouple weight and activation quantization, neglecting their intrinsic coupling—leading to substantial degradation in structural similarity and pixel-level accuracy. To address this, we propose HarmoQ, a unified quantization framework that systematically models the interdependence between weight and activation quantization for the first time. HarmoQ introduces a three-stage mechanism: (i) structural residual calibration to preserve fine-grained reconstruction details; (ii) closed-form harmonized scale optimization to align quantized weight and activation distributions; and (iii) iterative adaptive boundary refinement to suppress error accumulation. Under extreme 2-bit quantization, HarmoQ achieves a PSNR gain of +0.46 dB over state-of-the-art methods on Set5. On an A100 GPU, it delivers 3.2× inference speedup and 4× memory compression, significantly enhancing both reconstruction fidelity and deployment efficiency of low-bit super-resolution models.
📝 Abstract
Post-training quantization offers an efficient pathway to deploy super-resolution models, yet existing methods treat weight and activation quantization independently, missing their critical interplay. Through controlled experiments on SwinIR, we uncover a striking asymmetry: weight quantization primarily degrades structural similarity, while activation quantization disproportionately affects pixel-level accuracy. This stems from their distinct roles: weights encode learned restoration priors for textures and edges, whereas activations carry input-specific intensity information. Building on this insight, we propose HarmoQ, a unified framework that harmonizes quantization across components through three synergistic steps: structural residual calibration proactively adjusts weights to compensate for activation-induced detail loss; harmonized scale optimization analytically balances quantization difficulty via closed-form solutions; and adaptive boundary refinement iteratively maintains this balance during optimization. Experiments show that HarmoQ achieves substantial gains under aggressive compression, outperforming prior art by 0.46 dB on Set5 under 2-bit quantization while delivering a 3.2× speedup and 4× memory reduction on an A100 GPU. This work provides the first systematic analysis of weight-activation coupling in super-resolution quantization and establishes a principled solution for efficient, high-quality image restoration.
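To make the low-bit setting concrete, here is a minimal sketch of symmetric uniform quantization at 2 bits together with a simple MSE-driven clipping-scale search. This is an illustration of the general technique the abstract builds on, not HarmoQ itself: the function names `quantize` and `mse_optimal_scale` are hypothetical, and HarmoQ's closed-form harmonized scales and boundary refinement are not reproduced here.

```python
# Illustrative sketch only (assumed helper names, not the HarmoQ method):
# symmetric uniform quantization plus a grid search for the clipping
# scale that minimizes reconstruction MSE on a calibration tensor.
import numpy as np

def quantize(x, scale, bits=2):
    """Quantize to `bits`-bit signed integers, then dequantize."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 1 for 2-bit
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def mse_optimal_scale(x, bits=2, shrink=np.linspace(0.3, 1.0, 15)):
    """Grid-search a clipping scale minimizing reconstruction MSE.
    The max-based scale (shrink factor 1.0) is always a candidate,
    so the result is never worse than naive max-based clipping."""
    naive = np.abs(x).max() / (2 ** (bits - 1) - 1)
    cands = naive * shrink
    errs = [np.mean((x - quantize(x, s, bits)) ** 2) for s in cands]
    return cands[int(np.argmin(errs))]

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=4096)                 # stand-in weight tensor
naive_scale = np.abs(w).max() / (2 ** (2 - 1) - 1)  # max-based 2-bit scale
opt_scale = mse_optimal_scale(w, bits=2)
err_naive = np.mean((w - quantize(w, naive_scale)) ** 2)
err_opt = np.mean((w - quantize(w, opt_scale)) ** 2)
print(f"naive MSE {err_naive:.6f}  vs  searched MSE {err_opt:.6f}")
```

With only four representable levels at 2 bits, the choice of clipping scale dominates reconstruction error, which is why per-tensor scale optimization (and, in HarmoQ, its joint harmonization across weights and activations) matters so much in this regime.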