Local Patches Meet Global Context: Scalable 3D Diffusion Priors for Computed Tomography Reconstruction

📅 2025-12-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses high-resolution 3D CT reconstruction, an inverse problem constrained by both computation and training data. We propose a scalable, patch-based framework for learning 3D diffusion priors. Methodologically, we introduce the first position-aware joint modeling of local 3D patches together with a downsampled global volume, overcoming the limitations of transferred 2D priors and enabling a fully 3D diffusion prior to be learned from only a small number of CT scans. We further incorporate positional encoding and multi-scale contextual coupling, and design a progressive volumetric sampling strategy for reconstruction. Evaluated on multiple public CT datasets, our method achieves state-of-the-art performance, supporting high-resolution reconstructions up to 512×512×256 (≈20 minutes per inference) with significant improvements in both reconstruction fidelity and inference efficiency.
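The core training input described above pairs a local 3D patch, its position within the volume, and a coarse downsampled copy of the full volume as global context. A minimal sketch of how such a triplet might be constructed is shown below; the function name, patch/context sizes, and the strided-downsampling choice are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def make_training_triplet(volume, patch_size=32, global_size=16, rng=None):
    """Hypothetical sketch: extract a position-aware local patch plus a
    downsampled global-context volume from one CT scan.

    Names and sizes are assumptions for illustration only."""
    rng = rng or np.random.default_rng()
    D, H, W = volume.shape

    # Sample a random patch corner so the patch lies fully inside the volume.
    z = rng.integers(0, D - patch_size + 1)
    y = rng.integers(0, H - patch_size + 1)
    x = rng.integers(0, W - patch_size + 1)
    patch = volume[z:z + patch_size, y:y + patch_size, x:x + patch_size]

    # Normalized patch-center coordinates serve as a simple positional encoding.
    pos = np.array([(z + patch_size / 2) / D,
                    (y + patch_size / 2) / H,
                    (x + patch_size / 2) / W])

    # Cheap global context: strided downsampling of the whole volume
    # (a real pipeline would likely use anti-aliased resampling).
    sz, sy, sx = D // global_size, H // global_size, W // global_size
    global_ctx = volume[::sz, ::sy, ::sx][:global_size, :global_size, :global_size]
    return patch, pos, global_ctx
```

A diffusion model could then denoise the local patch conditioned on `pos` and `global_ctx`, which is what lets patch-level training scale while the coarse volume keeps patches globally consistent.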

📝 Abstract
Diffusion models learn strong image priors that can be leveraged to solve inverse problems like medical image reconstruction. However, for real-world applications such as 3D Computed Tomography (CT) imaging, directly training diffusion models on 3D data presents significant challenges due to the high computational demands of extensive GPU resources and large-scale datasets. Existing works mostly reuse 2D diffusion priors to address 3D inverse problems, but fail to fully realize and leverage the generative capacity of diffusion models for high-dimensional data. In this study, we propose a novel 3D patch-based diffusion model that can learn a fully 3D diffusion prior from limited data, enabling scalable generation of high-resolution 3D images. Our core idea is to learn the prior of 3D patches to achieve scalable efficiency, while coupling local and global information to guarantee high-quality 3D image generation, by modeling the joint distribution of position-aware 3D local patches and downsampled 3D volume as global context. Our approach not only enables high-quality 3D generation, but also offers an unprecedentedly efficient and accurate solution to high-resolution 3D inverse problems. Experiments on 3D CT reconstruction across multiple datasets show that our method outperforms state-of-the-art methods in both performance and efficiency, notably achieving high-resolution 3D reconstruction of $512 \times 512 \times 256$ ($\sim$20 mins).
Problem

Research questions and friction points this paper is trying to address.

Training diffusion models directly on 3D volumes demands extensive GPU resources and large-scale datasets
Reusing 2D diffusion priors for 3D inverse problems fails to exploit the generative capacity of diffusion models for high-dimensional data
High-resolution 3D CT reconstruction must remain accurate and efficient when only limited training data is available
Innovation

Methods, ideas, or system contributions that make the work stand out.

Position-aware joint modeling of local 3D patches and a downsampled global volume as context
Multi-scale contextual coupling with a progressive volumetric sampling strategy for reconstruction
State-of-the-art 512×512×256 reconstruction in roughly 20 minutes, with the prior learned from only a small number of CT scans