🤖 AI Summary
At high guidance scales, Classifier-Free Guidance drives sampling trajectories off the data manifold because it extrapolates in Euclidean ambient space, producing oversaturation, textural artifacts, and structural collapse. This work reframes diffusion guidance as a locally optimal control problem through the lens of Riemannian geometry and derives a geometry-aware, closed-form update rule that corrects this manifold deviation without retraining. It further devises a dynamic energy-balancing schedule that adaptively modulates guidance strength throughout sampling. The proposed approach incurs negligible computational overhead while significantly improving both image fidelity and conditional alignment, outperforming existing baselines.
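For context, the standard CFG update the paper critiques is a linear extrapolation between the unconditional and conditional noise predictions. The sketch below shows that update in plain NumPy; the variable names and toy vectors are illustrative assumptions, not taken from the paper, and the paper's Riemannian correction is not reproduced here.

```python
import numpy as np

def cfg_update(eps_uncond: np.ndarray, eps_cond: np.ndarray, w: float) -> np.ndarray:
    """Standard Classifier-Free Guidance: Euclidean extrapolation in ambient space.

    At w = 1 this reduces to the conditional prediction; at the large w
    typically used in practice, the guided prediction is pushed well beyond
    the conditional one -- the off-manifold drift the paper targets.
    """
    return eps_uncond + w * (eps_cond - eps_uncond)

# Toy noise predictions in a 4-dimensional ambient space (hypothetical values).
eps_u = np.array([0.0, 1.0, 0.0, 1.0])
eps_c = np.array([0.2, 0.8, 0.1, 1.1])

guided = cfg_update(eps_u, eps_c, w=7.5)
# The displacement from eps_u grows linearly with w, so high guidance
# scales amplify the extrapolation rather than projecting it back
# toward the data manifold.
```

Note that the extrapolation is purely vectorial: nothing in `cfg_update` accounts for the curvature of the data manifold, which is the geometric mismatch the abstract attributes the failure modes to.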
📝 Abstract
Classifier-Free Guidance (CFG) serves as the de facto control mechanism for conditional diffusion, yet high guidance scales notoriously induce oversaturation, texture artifacts, and structural collapse. We attribute this failure to a geometric mismatch: standard CFG performs Euclidean extrapolation in ambient space, inadvertently driving sampling trajectories off the high-density data manifold. To resolve this, we present Manifold-Optimal Guidance (MOG), a framework that reformulates guidance as a local optimal control problem. MOG yields a closed-form, geometry-aware Riemannian update that corrects off-manifold drift without requiring retraining. Leveraging this perspective, we further introduce Auto-MOG, a dynamic energy-balancing schedule that adaptively calibrates guidance strength, effectively eliminating the need for manual hyperparameter tuning. Extensive validation demonstrates that MOG yields superior fidelity and alignment compared to baselines, with virtually no added computational overhead.