AI Summary
This work addresses the challenges in solving imaging inverse problems arising from ill-conditioned sensing matrices imposed by physical constraints, which hinder data fidelity optimization and limit reconstruction quality. To overcome these limitations, the authors propose a teacher-guided knowledge distillation framework for preconditioning operator optimization. Notably, this approach leverages knowledge distillation not for model compression but to enhance reconstruction performance: a student algorithm, constrained to use a physically realizable sensing matrix, is trained to emulate the behavior of a teacher algorithm operating under an ideal (well-conditioned) sensing matrix. The framework supports both linear (interpretable) and nonlinear (scalable) preconditioner designs and demonstrates significant improvements in reconstruction accuracy and numerical stability across diverse applications, including MRI, compressive sensing, and super-resolution.
Abstract
Imaging inverse problems are usually addressed by designing suitable prior models of the underlying signal. However, minimizing the data-fidelity term remains difficult because physical constraints in the acquisition system make the sensing matrix ill-conditioned. Classical optimization theory handles ill-conditioned data-fidelity minimization with preconditioning, which transforms the algorithm's gradient step to achieve faster convergence and better numerical stability. We extend the preconditioning concept beyond convergence acceleration and use it to improve reconstruction quality. We introduce DIPA: Distilled Preconditioned Algorithms, where a preconditioning operator (PO) is optimized using teacher-guided distillation criteria. Unlike standard knowledge distillation (KD) for model compression, the teacher and student differ in the sensing operators available during reconstruction: the teacher uses a simulated, better-conditioned, and more informative sensing matrix, whereas the student uses the physically feasible sensing matrix. We design several distillation loss functions, each transferring a different property of the teacher algorithm to the preconditioned student. The PO can be linear (L-DIPA), allowing interpretability, or non-linear (N-DIPA), parametrized by a neural network and offering better scalability. We validate the proposed PO design across several imaging modalities, including magnetic resonance imaging, compressed sensing, and super-resolution imaging.
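To make the classical preconditioning idea mentioned in the abstract concrete, the following is a minimal NumPy sketch of a preconditioned gradient step on a noiseless least-squares data-fidelity term. It is not the DIPA method: the preconditioner here is a fixed regularized inverse of the normal matrix, chosen purely for illustration, whereas the abstract describes a PO learned through teacher-guided distillation. The matrix size, singular-value range, step size, and the names gradient_descent, eps, err_plain, and err_prec are all assumptions.

```python
import numpy as np


def gradient_descent(A, y, P=None, step=1.0, iters=50):
    """Minimize 0.5 * ||A x - y||^2 via x <- x - step * P A^T (A x - y).

    P is the preconditioning operator; P = I gives plain gradient descent.
    """
    n = A.shape[1]
    P = np.eye(n) if P is None else P
    x = np.zeros(n)
    for _ in range(iters):
        residual = A @ x - y
        x = x - step * (P @ (A.T @ residual))
    return x


# Build a synthetic ill-conditioned sensing matrix (assumed, for illustration):
# singular values decay from 1 down to 1e-3.
rng = np.random.default_rng(0)
n = 20
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
singular_values = np.logspace(0, -3, n)
A = U @ np.diag(singular_values) @ V.T

x_true = rng.standard_normal(n)
y = A @ x_true  # noiseless measurements

# Plain gradient descent: step = 1 / sigma_max^2 = 1 keeps the iteration stable.
x_plain = gradient_descent(A, y, step=1.0, iters=50)

# Hand-crafted linear preconditioner (NOT the learned PO from the paper):
# a regularized inverse of the normal matrix A^T A.
eps = 1e-8
P = np.linalg.inv(A.T @ A + eps * np.eye(n))
x_prec = gradient_descent(A, y, P=P, step=1.0, iters=50)


def relative_error(x_hat):
    return np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)


err_plain = relative_error(x_plain)
err_prec = relative_error(x_prec)
print(f"plain GD relative error:          {err_plain:.3e}")
print(f"preconditioned GD relative error: {err_prec:.3e}")
```

With the same iteration budget, the plain iteration barely moves along the directions associated with small singular values, while the preconditioned iteration rescales them and recovers the signal far more accurately. In the framework the abstract describes, the preconditioner would instead be a learned linear operator (L-DIPA) or a neural network (N-DIPA) trained against a teacher that has access to a better-conditioned sensing matrix.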