Local Diffusion Models and Phases of Data Distributions

📅 2025-08-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion models often ignore local structural priors in data (e.g., natural images), forcing them to learn global score functions—resulting in high computational overhead. Method: This paper introduces the novel concept of “data distribution phases,” revealing a sharp phase transition—from a trivial phase to a data-concentrated phase—during denoising. We theoretically prove that outside the phase-transition interval, local denoisers achieve the information-theoretic performance bound. Leveraging this insight, we propose a phase-aware hybrid architecture that dynamically switches between local and global modules. Using score-function modeling, conditional mutual information bounds, and phase-transition diagnostics, we empirically validate the phenomenon on real image datasets. Contribution/Results: Our approach demonstrates that local networks can efficiently replace global ones post-transition, significantly reducing computation while preserving fidelity—establishing a new paradigm for lightweight diffusion modeling.
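To make the switching idea concrete, here is a minimal sketch of a phase-aware hybrid denoiser. The function names (`local_score`, `global_score`, `phase_aware_score`), the 3x3-neighbourhood stand-in for a small convolutional network, and the transition window `[t_lo, t_hi]` are illustrative assumptions, not the paper's actual architecture: the point is only that the expensive global module is invoked solely inside the narrow phase-transition interval.

```python
import numpy as np

def local_score(x):
    """Cheap, spatially local score estimate: each pixel sees only its
    3x3 neighbourhood (a stand-in for a small convolutional network)."""
    padded = np.pad(x, 1, mode="edge")
    h, w = x.shape
    neighbourhood_mean = np.zeros_like(x, dtype=float)
    for di in (0, 1, 2):          # offsets -1, 0, +1 after padding
        for dj in (0, 1, 2):
            neighbourhood_mean += padded[di:di + h, dj:dj + w]
    neighbourhood_mean /= 9.0
    return neighbourhood_mean - x  # pull each pixel toward its local mean

def global_score(x):
    """Expensive, spatially global estimate: every pixel sees the whole
    image (a stand-in for a large global network)."""
    return x.mean() - x

def phase_aware_score(x, t, t_lo=0.4, t_hi=0.6):
    """Dispatch by denoising time t: use the global module only inside
    the (hypothetical) phase-transition window [t_lo, t_hi]."""
    if t_lo <= t <= t_hi:
        return global_score(x)
    return local_score(x)
```

Since the transition interval is narrow, most reverse-process steps hit the cheap local branch, which is where the claimed computational savings come from.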

📝 Abstract
As a class of generative artificial intelligence frameworks inspired by statistical physics, diffusion models have shown extraordinary performance in synthesizing complicated data distributions through a denoising process gradually guided by score functions. Real-life data, like images, is often spatially structured in low-dimensional spaces. However, ordinary diffusion models ignore this local structure and learn spatially global score functions, which are often computationally expensive. In this work, we introduce a new perspective on the phases of data distributions, which provides insight into constructing local denoisers with reduced computational costs. We define two distributions as belonging to the same data distribution phase if they can be mutually connected via spatially local operations such as local denoisers. Then, we show that the reverse denoising process consists of an early trivial phase and a late data phase, sandwiching a rapid phase transition where local denoisers must fail. To diagnose such phase transitions, we prove an information-theoretic bound on the fidelity of local denoisers based on conditional mutual information, and conduct numerical experiments on a real-world dataset. This work suggests simpler and more efficient architectures for diffusion models: far from the phase transition point, small local neural networks suffice to compute the score function; global neural networks are only necessary within the narrow time interval around the phase transition. This result also opens new directions for studying phases of data distributions and the broader science of generative artificial intelligence, and for designing neural networks guided by physics concepts.
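The abstract's diagnostic rests on a conditional mutual information (CMI) bound: when a distant region Z still carries information about a pixel X beyond what its local neighbourhood Y provides, a local denoiser must lose fidelity. As a purely illustrative aid (this is a generic plug-in estimator over discretized samples, not the paper's proven bound or its exact diagnostic), CMI can be estimated from paired discrete samples as follows:

```python
import math
from collections import Counter

def conditional_mutual_information(xs, zs, ys):
    """Plug-in estimate of I(X; Z | Y) in nats from paired discrete samples,
    using I(X; Z | Y) = sum_{x,y,z} p(x,y,z) log[ p(x,y,z) p(y) / (p(x,y) p(z,y)) ].
    xs, zs, ys are equal-length sequences of hashable (e.g. binned) values."""
    n = len(xs)
    pxyz = Counter(zip(xs, ys, zs))  # joint counts of (x, y, z)
    pxy = Counter(zip(xs, ys))
    pzy = Counter(zip(zs, ys))
    py = Counter(ys)
    total = 0.0
    for (x, y, z), c in pxyz.items():
        # The 1/n normalisations cancel inside the log, so raw counts suffice.
        total += (c / n) * math.log((c * py[y]) / (pxy[(x, y)] * pzy[(z, y)]))
    return total
```

Intuitively, a CMI near zero means the neighbourhood Y screens off the rest of the image, so a local denoiser can match the information-theoretic optimum; a large CMI flags the phase-transition interval where only a global network can succeed.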
Problem

Research questions and friction points this paper is trying to address.

High computational cost of learning spatially global score functions in diffusion models
How to exploit spatial locality via local denoisers for structured data distributions
Detecting phase transitions in the reverse denoising process
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local denoisers reduce computational costs
Phases of data distributions guide denoisers
Small local networks suffice far from transitions
Fangjun Hu
Department of Electrical and Computer Engineering, Princeton University, Princeton, NJ 08544, USA; QuEra Computing Inc., 1284 Soldiers Field Road, Boston, MA 02135, USA
Guangkuo Liu
JILA and Department of Physics, University of Colorado Boulder, Boulder, CO 80309, USA
Yifan Zhang
Department of Electrical and Computer Engineering, Princeton University, Princeton, NJ 08544, USA
Xun Gao
JILA and Department of Physics, University of Colorado Boulder, Boulder, CO 80309, USA
quantum information theory