Accelerating Learned Image Compression Through Modeling Neural Training Dynamics

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high computational cost and slow convergence of training learned image compression (LIC) models, this paper proposes an efficient training framework grounded in modeling neural training dynamics. The method introduces two key components: (1) a Sensitivity-aware True and Dummy Embedding Training (STDET) mechanism that compresses the parameter space via sensitivity analysis and parameter-mode clustering; and (2) a Sampling-then-Moving Average (SMA) strategy that interpolates weights sampled from SGD training into a moving average, reducing the trainable parameter count and training memory footprint while suppressing training-state variance. A theoretical analysis on a noisy quadratic model shows that SMA attains lower training variance than standard SGD. Experiments demonstrate that the approach accelerates convergence without sacrificing rate-distortion performance, reducing model parameters by approximately 40% and GPU memory consumption by 35%.

📝 Abstract
As learned image compression (LIC) methods become increasingly computationally demanding, enhancing their training efficiency is crucial. This paper takes a step forward in accelerating the training of LIC methods by modeling the neural training dynamics. We first propose a Sensitivity-aware True and Dummy Embedding Training mechanism (STDET) that clusters LIC model parameters into a few separate modes, where parameters are expressed as affine transformations of reference parameters within the same mode. By further exploiting the stable intra-mode correlations throughout training together with parameter sensitivities, we gradually embed non-reference parameters, reducing the number of trainable parameters. Additionally, we incorporate a Sampling-then-Moving Average (SMA) technique, interpolating sampled weights from stochastic gradient descent (SGD) training to obtain the moving average weights, ensuring smooth temporal behavior and minimizing training state variances. Overall, our method significantly reduces training space dimensions and the number of trainable parameters without sacrificing model performance, thus accelerating model convergence. We also provide a theoretical analysis on the noisy quadratic model, showing that the proposed method achieves a lower training variance than standard SGD. Our approach offers valuable insights for further developing efficient training methods for LICs.
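The core STDET idea in the abstract, expressing each non-reference parameter as an affine transformation of its mode's reference parameter, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the clustering into modes is assumed given, the fitting uses a plain least-squares over recorded parameter histories, and the function names (`embed_parameters`, `reconstruct`) are hypothetical.

```python
import numpy as np

def embed_parameters(param_history, mode_ids, ref_ids):
    """Fit affine coefficients (a_i, b_i) so each non-reference parameter
    can be reconstructed as w_i = a_i * w_ref + b_i from its mode's
    reference parameter. `param_history` is a (steps, n_params) array of
    recorded SGD iterates; `mode_ids[i]` is parameter i's mode;
    `ref_ids[m]` is the reference parameter index of mode m."""
    coeffs = {}
    for i, m in enumerate(mode_ids):
        if i == ref_ids[m]:
            continue  # reference parameters stay trainable
        x = param_history[:, ref_ids[m]]  # reference trajectory
        y = param_history[:, i]           # member trajectory
        A = np.vstack([x, np.ones_like(x)]).T
        (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
        coeffs[i] = (a, b)
    return coeffs

def reconstruct(ref_values, coeffs, mode_ids, ref_ids, n_params):
    """Rebuild the full parameter vector from reference values only."""
    w = np.empty(n_params)
    for m, r in ref_ids.items():
        w[r] = ref_values[m]
    for i, (a, b) in coeffs.items():
        w[i] = a * ref_values[mode_ids[i]] + b
    return w
```

Once the intra-mode correlations stabilize, only the reference parameters need gradients, which is how the scheme shrinks the trainable parameter count.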
Problem

Research questions and friction points this paper is trying to address.

Accelerating training of learned image compression methods
Reducing trainable parameters without performance loss
Modeling neural training dynamics to improve training efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sensitivity-aware parameter clustering for efficiency
Sampling-then-Moving Average for smooth training
Reduced trainable parameters without performance loss
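The Sampling-then-Moving Average idea listed above can be sketched in a few lines: sample the SGD iterates periodically and interpolate them into a running average. This is an assumed simplification (fixed sampling interval and interpolation weight `beta`, both hypothetical names), shown only to illustrate the variance-smoothing effect.

```python
import numpy as np

def sma_trajectory(sgd_weights, sample_every=5, beta=0.3):
    """Sampling-then-Moving-Average sketch: every `sample_every` SGD
    steps, interpolate the sampled weights into the running average
    avg <- (1 - beta) * avg + beta * w, smoothing gradient noise."""
    avg = np.asarray(sgd_weights[0], dtype=float).copy()
    history = [avg.copy()]
    for t, w in enumerate(sgd_weights[1:], start=1):
        if t % sample_every == 0:
            avg = (1 - beta) * avg + beta * np.asarray(w, dtype=float)
        history.append(avg.copy())
    return np.array(history)
```

For an exponential average of this form, the stationary variance is roughly `beta / (2 - beta)` times the variance of the sampled iterates, which is the intuition behind the paper's claim that the averaged weights have lower training variance than raw SGD.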