🤖 AI Summary
Medical image enhancement relies on large-scale datasets, yet existing methods incur high training and storage costs while struggling to simultaneously preserve pixel-level fidelity and ensure patient privacy. To address this, we introduce dataset distillation, previously unexplored for low-level (many-to-many) medical image enhancement, via Structure-Preserving Personalized Generation (SPG). SPG constructs a shared anatomical prior from a representative patient and injects patient-specific knowledge through gradient alignment, synthesizing task-specific high- and low-quality image pairs. Because only abstract training signals are retained, SPG drastically reduces data storage and computational overhead; critically, the distilled data cannot be inverted to recover the original images, supporting privacy compliance. Empirically, SPG achieves performance comparable to full-dataset training across multiple enhancement tasks. This work establishes the first framework for dataset distillation in low-level medical vision, offering a lightweight, privacy-preserving paradigm for medical image modeling.
📝 Abstract
Medical image enhancement is clinically valuable, but existing methods require large-scale datasets to learn complex pixel-level mappings, and the substantial training and storage costs associated with these datasets hinder practical deployment. While dataset distillation (DD) can alleviate these burdens, existing methods mainly target high-level tasks, where multiple samples share the same label. This many-to-one mapping allows distilled data to capture shared semantics and achieve information compression. In contrast, low-level tasks involve a many-to-many mapping that requires pixel-level fidelity, making low-level DD an underdetermined problem: a small distilled dataset cannot fully constrain the dense pixel-level mappings. To address this, we propose the first low-level DD method for medical image enhancement. We first leverage anatomical similarities across patients to construct a shared anatomical prior from a representative patient, which serves as the initialization of the distilled data for different patients. This prior is then personalized for each patient using a Structure-Preserving Personalized Generation (SPG) module, which integrates patient-specific anatomical information into the distilled dataset while preserving pixel-level fidelity. For each low-level task, the distilled data is used to construct task-specific high- and low-quality training pairs. Patient-specific knowledge is injected into the distilled data by aligning the gradients computed from networks trained on the distilled pairs with those computed from the corresponding patient's raw data. Notably, downstream users never access raw patient data. Instead, only a distilled dataset containing abstract training information is shared, which excludes patient-specific details and thus preserves privacy.
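The gradient-alignment step can be illustrated with a minimal, dependency-free sketch of gradient matching, the general technique the abstract describes: synthetic low/high-quality pairs are optimized so that the model gradients they induce match the gradients induced by a patient's raw pairs. Everything below is illustrative, not the paper's implementation: the "enhancement network" is a toy linear map, the prior initialization is a representative subset of the real pairs, and the synthetic pairs are updated by finite-difference descent on the matching loss.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8                              # flattened patch dimension (toy scale)
W = rng.normal(size=(d, d)) * 0.1  # toy linear "enhancement network"

def model_grad(W, x_lo, x_hi):
    """Gradient of 0.5*||x_lo @ W.T - x_hi||^2 w.r.t. W, averaged over the batch."""
    resid = x_lo @ W.T - x_hi          # (n, d) residuals
    return resid.T @ x_lo / len(x_lo)  # (d, d) gradient

# "Raw patient data": 64 paired low-/high-quality patches (synthetic toy data).
x_lo_real = rng.normal(size=(64, d))
x_hi_real = x_lo_real @ (0.5 * rng.normal(size=(d, d)).T) \
            + 0.01 * rng.normal(size=(64, d))

# Distilled pairs, initialized from a small representative subset
# (standing in for the shared anatomical prior of a representative patient).
x_lo_syn = x_lo_real[:4].copy()
x_hi_syn = x_hi_real[:4].copy()

def match_loss(x_lo_s, x_hi_s):
    """Squared distance between real-data and distilled-data gradients."""
    g_real = model_grad(W, x_lo_real, x_hi_real)
    g_syn = model_grad(W, x_lo_s, x_hi_s)
    return float(np.sum((g_real - g_syn) ** 2))

def numgrad(f, x, eps=1e-5):
    """Central finite-difference gradient of f() w.r.t. the entries of x."""
    g = np.zeros_like(x)
    for i in np.ndindex(*x.shape):
        x[i] += eps; fp = f()
        x[i] -= 2 * eps; fm = f()
        x[i] += eps
        g[i] = (fp - fm) / (2 * eps)
    return g

before = match_loss(x_lo_syn, x_hi_syn)
cur, lr = before, 0.2
for _ in range(15):
    g_lo = numgrad(lambda: match_loss(x_lo_syn, x_hi_syn), x_lo_syn)
    g_hi = numgrad(lambda: match_loss(x_lo_syn, x_hi_syn), x_hi_syn)
    # Backtracking: shrink the step until the matching loss decreases.
    while lr > 1e-8 and match_loss(x_lo_syn - lr * g_lo, x_hi_syn - lr * g_hi) >= cur:
        lr *= 0.5
    trial = match_loss(x_lo_syn - lr * g_lo, x_hi_syn - lr * g_hi)
    if trial >= cur:
        break
    x_lo_syn -= lr * g_lo
    x_hi_syn -= lr * g_hi
    cur = trial
after = cur
print(before, "->", after)  # matching loss shrinks as the pairs are distilled
```

After optimization, only the four distilled pairs (plus the task definition) would be shared downstream; the 64 raw pairs never leave the source site, which is the privacy mechanism the abstract describes, here at toy scale.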