🤖 AI Summary
Diffusion model sampling is prone to hallucinations caused by biased score estimation, yet existing methods cannot identify and correct high-risk sampling steps online. This paper proposes RODS, a robust diffusion sampling framework that requires no retraining and, for the first time, brings robust-optimization principles into the sampling process. RODS dynamically assesses sampling risk by modeling the geometric structure of the loss landscape and adaptively injects perturbations to rectify anomalous trajectories. Evaluated on AFHQv2, FFHQ, and 11k-hands, RODS detects over 70% of hallucinated samples and corrects more than 25% of them, significantly improving fidelity and robustness while incurring negligible inference overhead and introducing no additional artifacts.
📝 Abstract
Diffusion models have achieved state-of-the-art performance in generative modeling, yet their sampling procedures remain vulnerable to hallucinations, often stemming from inaccuracies in score approximation. In this work, we reinterpret diffusion sampling through the lens of optimization and introduce RODS (Robust Optimization-inspired Diffusion Sampler), a novel method that detects and corrects high-risk sampling steps using geometric cues from the loss landscape. RODS enforces smoother sampling trajectories and adaptively adjusts perturbations, reducing hallucinations without retraining and at minimal additional inference cost. Experiments on AFHQv2, FFHQ, and 11k-hands demonstrate that RODS improves both sampling fidelity and robustness, detecting over 70% of hallucinated samples and correcting more than 25%, all while avoiding the introduction of new artifacts.
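The digest above stays at a high level, so the following is only a rough PyTorch sketch of what a risk-aware sampling loop *could* look like; it is not the authors' RODS algorithm. The `score_model` callable, the finite-difference risk probe, the `risk_threshold` parameter, and the score-averaging correction are all hypothetical stand-ins for the paper's loss-landscape geometry cues and adaptive perturbations.

```python
import torch

def risk_aware_sample(score_model, x_T, sigmas, risk_threshold=0.1, eps=1e-3):
    """Hypothetical sketch of a risk-aware diffusion sampling loop.

    NOT the paper's algorithm: the risk measure below probes the local
    sensitivity of the score field with a small finite-difference
    perturbation, standing in for RODS's loss-landscape geometry cues.
    """
    x = x_T
    for i in range(len(sigmas) - 1):
        sigma, sigma_next = sigmas[i], sigmas[i + 1]
        score = score_model(x, sigma)

        # Probe how much the score changes under a tiny random perturbation;
        # a large relative change is treated as a "high-risk" step.
        delta = eps * torch.randn_like(x)
        score_perturbed = score_model(x + delta, sigma)
        risk = (score_perturbed - score).norm() / (eps * score.norm() + 1e-8)

        if risk > risk_threshold:
            # Crude correction: smooth the update direction by averaging
            # the two score evaluations instead of trusting either alone.
            score = 0.5 * (score + score_perturbed)

        # Standard Euler step for the (VE) probability-flow ODE,
        # where dx/dsigma = -sigma * score.
        x = x + (sigma_next - sigma) * (-sigma * score)
    return x
```

Note that a probe of this kind costs at most one extra score evaluation per step, which is consistent with the paper's claim of minimal additional inference cost, though the actual detection and correction rules in RODS are not specified in this digest.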