🤖 AI Summary
This work addresses the doubly exponential dependence of sample complexity on dimensionality in score function estimation for diffusion models. Methodologically, we propose a joint time-step estimation framework in which a single neural network cooperatively learns score functions across multiple noise levels, incorporating Bootstrapped Score Matching (BSM) for variance reduction. The analysis rests on a martingale error decomposition, sharp variance bounds, and a model of Markov-dependent training data. Theoretically, we establish the first nearly dimension-independent upper bound on sample complexity, improving the dimension dependence from doubly exponential to near-constant. Empirically, BSM significantly improves estimation accuracy in high-noise regimes, yielding better generative quality and training stability. Our results provide both a scalable theoretical foundation and a practical algorithm for high-dimensional generative modeling.
📝 Abstract
Diffusion models generate samples by estimating the score function of the target distribution at various noise levels. The model is trained on samples drawn from the target distribution, to which noise is progressively added. In this work, we establish the first (nearly) dimension-free sample complexity bounds for learning these score functions, achieving a double exponential improvement in dimension over prior results. A key aspect of our analysis is the use of a single function approximator to jointly estimate scores across noise levels, a critical feature of diffusion models in practice that enables generalization across timesteps. Our analysis introduces a novel martingale-based error decomposition and sharp variance bounds, enabling efficient learning from dependent data generated by Markov processes, which may be of independent interest. Building on these insights, we propose Bootstrapped Score Matching (BSM), a variance reduction technique that utilizes previously learned scores to improve accuracy at higher noise levels. These results provide crucial insights into the efficiency and effectiveness of diffusion models for generative modeling.
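The bootstrapping idea can be illustrated numerically. The sketch below is not the paper's algorithm, only a minimal toy in one dimension: for an Ornstein-Uhlenbeck forward process on standard-normal data, the true score is known (`s*(x) = -x` at every noise level), so we can compare the residual variance of the vanilla denoising score matching (DSM) target at a high noise level `t` against a bootstrapped target that reuses the score already learned at a lower noise level `s`. All variable names and the particular bootstrapped-target formula are illustrative assumptions, with the previously learned score stood in for by the exact one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Toy setup: 1-D standard-normal data under an OU forward process
#   x_t = e^{-t} x_0 + sqrt(1 - e^{-2t}) * eps,   eps ~ N(0, 1),
# so every noisy marginal is again N(0, 1) and the true score is
# s*(x) = -x at every noise level (a convenient ground truth).
s, t = 0.5, 2.0                 # lower and higher noise levels, s < t
a_s, a_t = np.exp(-s), np.exp(-t)
sig_s = np.sqrt(1 - a_s**2)
sig_t = np.sqrt(1 - a_t**2)
a_st = np.exp(-(t - s))         # contraction from level s to level t
sig_st = np.sqrt(1 - a_st**2)   # fresh-noise scale from s to t

x0 = rng.standard_normal(n)
eps_s = rng.standard_normal(n)  # noise injected up to level s
eps_st = rng.standard_normal(n) # fresh noise between levels s and t
x_s = a_s * x0 + sig_s * eps_s
x_t = a_st * x_s + sig_st * eps_st  # same law as e^{-t} x_0 + sig_t * eps

# Vanilla DSM regression target at level t: -eps_t / sig_t,
# where eps_t is the total noise accumulated in x_t.
eps_t = (x_t - a_t * x0) / sig_t
dsm_target = -eps_t / sig_t

# Bootstrapped target (illustrative): swap the time-s portion of the
# noisy label for the previously learned score at level s.  Here we
# use the exact score s_prev(x) = -x, i.e. assume level s was learned.
s_prev = -x_s
bsm_target = (a_st * sig_s**2 * s_prev - sig_st * eps_st) / sig_t**2

true_score = -x_t
var_dsm = float(np.var(dsm_target - true_score))
var_bsm = float(np.var(bsm_target - true_score))
print(f"residual variance  DSM: {var_dsm:.4f}  BSM: {var_bsm:.4f}")
```

Both targets are conditionally unbiased estimates of the score at level `t` (the bootstrapped one exactly so only when the level-`s` score is exact), but the bootstrapped target replaces the level-`s` noise with its conditional mean, so its residual variance is strictly smaller; in this toy run the reduction is roughly threefold.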