🤖 AI Summary
This work addresses PET super-resolution, where supervised training is hindered by the absence of within-subject paired high- and low-resolution data and where effective resolution is limited by scanner physics; under such weak constraints, purely image-domain generative restoration often hallucinates structures. The authors formulate the task as a posterior inference problem under heterogeneous imaging system configurations and, for the first time, jointly incorporate CT-derived anatomical priors and an explicit scanner physical model (including a point spread function) without requiring paired PET data. They propose a conditional diffusion model with cross-attention, coupled with gradient-based data-consistency refinement, to achieve reconstructions that are both measurement-consistent and structurally faithful. In both standard and out-of-distribution settings, the method consistently outperforms strong baselines, yielding improved quantitative metrics, better lesion-level clinical relevance indicators, and fewer hallucination artifacts.
📝 Abstract
PET super-resolution is highly under-constrained because paired multi-resolution scans from the same subject are rarely available, and effective resolution is determined by scanner-specific physics (e.g., the point spread function (PSF), detector geometry, and acquisition settings). This limits supervised end-to-end training and makes purely image-domain generative restoration prone to hallucinated structures when anatomical and physical constraints are weak. We formulate PET super-resolution as posterior inference under heterogeneous system configurations and propose a CT-conditioned diffusion framework with physics-constrained sampling. During training, a conditional diffusion prior is learned from high-quality PET/CT pairs using cross-attention for anatomical guidance, without requiring paired low-resolution/high-resolution (LR-HR) PET data. During inference, measurement consistency is enforced through a scanner-aware forward model with explicit PSF effects and gradient-based data-consistency refinement. Under both standard and out-of-distribution (OOD) settings, the proposed method consistently improves quantitative metrics and lesion-level clinical relevance indicators over strong baselines, while reducing hallucination artifacts and improving structural fidelity.
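The gradient-based data-consistency refinement mentioned in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes a 1D signal, a Gaussian PSF applied as a circular convolution, block averaging as the resolution loss, and a least-squares data term 0.5 * ||A x - y||^2. All function names and parameter values are hypothetical, and the diffusion prior and CT conditioning are omitted entirely; in the full method, a step like this would presumably be interleaved with the diffusion sampling updates.

```python
import math

def gaussian_kernel(sigma, radius):
    """Symmetric, normalized 1D Gaussian kernel standing in for the scanner PSF."""
    w = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def blur(x, kernel):
    """Circular convolution; self-adjoint because the kernel is symmetric."""
    n, r = len(x), len(kernel) // 2
    return [sum(kernel[k + r] * x[(i - k) % n] for k in range(-r, r + 1))
            for i in range(n)]

def downsample(x, s):
    """Average non-overlapping blocks of s samples."""
    return [sum(x[i:i + s]) / s for i in range(0, len(x), s)]

def downsample_adjoint(y, s):
    """Adjoint of block averaging: spread each value evenly over its block."""
    return [v / s for v in y for _ in range(s)]

def forward(x, kernel, s):
    """Assumed forward model A: PSF blur followed by downsampling."""
    return downsample(blur(x, kernel), s)

def adjoint(r, kernel, s):
    """Adjoint A^T of the forward model."""
    return blur(downsample_adjoint(r, s), kernel)

def data_consistency_step(x, y, kernel, s, step):
    """One gradient step on 0.5 * ||A x - y||^2, whose gradient is A^T (A x - y)."""
    residual = [a - b for a, b in zip(forward(x, kernel, s), y)]
    grad = adjoint(residual, kernel, s)
    return [xi - step * gi for xi, gi in zip(x, grad)]

def residual_norm(x, y, kernel, s):
    """Euclidean norm of the measurement residual A x - y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(forward(x, kernel, s), y)))

# Toy usage: simulate a low-resolution measurement, then refine from zeros.
kernel = gaussian_kernel(sigma=1.5, radius=4)
x_true = [0.0] * 16
x_true[5], x_true[10] = 1.0, 0.5
y = forward(x_true, kernel, 2)

x = [0.0] * 16
initial_residual = residual_norm(x, y, kernel, 2)
for _ in range(50):
    x = data_consistency_step(x, y, kernel, 2, step=1.0)
final_residual = residual_norm(x, y, kernel, 2)
```

Because the problem is under-constrained, driving the residual down does not by itself recover `x_true`; many high-resolution signals explain the same measurement. That gap is what the learned, CT-conditioned prior is meant to fill in the described framework.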