🤖 AI Summary
This work addresses the lack of per-sample confidence estimates in generative models, whose outputs are otherwise hard to trust. The authors propose Flow Matching with Confidence (FMwC), which injects input-dependent multiplicative noise at selected layers, propagates the resulting variance through the network in closed form, and integrates it along the ODE trajectory, yielding a confidence score at standard sampling cost. The score is found to correlate with the magnitude of the divergence of the learned velocity field. Filtering by the score improves image quality and the thermodynamic stability of generated crystals; the score also supports editing, by rewinding trajectories to the points where the model commits and redirecting them, and adaptive stepping that concentrates ODE compute where the flow is ambiguous.
📝 Abstract
Generative models can produce nonsensical text, unrealistic images, and unstable materials faster than simulation or human review can absorb; without per-sample confidence, trust erodes. Existing fixes run $k$ ensembles or stochastic trajectories at $k\times$ compute, measuring variability between models, not model confidence. We propose Flow Matching with Confidence (FMwC). FMwC injects input-dependent multiplicative noise at selected layers, propagates its variance through the network in closed form, and integrates it along the ODE trajectory, yielding a per-sample confidence score at standard sampling cost. The score supports multiple uses: filtering improves image quality and thermodynamic stability of crystals; editing rewinds trajectories to the points where the model commits and redirects them; and adaptive stepping concentrates ODE compute where the flow is ambiguous. We find that the confidence score correlates with the magnitude of the divergence of the learned velocity field. This gives a window into the generative process and opens up surgical forms of guidance that target the moments that matter, new sampling algorithms, and interpretability of generative models.
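The mechanism the abstract describes (multiplicative noise, closed-form variance propagation, integration along the ODE trajectory) can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the authors' implementation: the velocity field is a single linear layer, the noise scale `exp(a_j * x_j)` is a hypothetical choice of input-dependent scale, and summing the per-step variance into one scalar is an assumed way of aggregating the score.

```python
import math


def noisy_linear_moments(x, W, b, a):
    """Closed-form mean and variance of y = W (x * (1 + s(x) * eps)) + b.

    eps has independent standard-normal entries, so no sampling is needed.
    The noise scale s_j(x) = exp(a[j] * x[j]) is a hypothetical choice.
    """
    s = [math.exp(aj * xj) for aj, xj in zip(a, x)]
    # Mean: the noise has zero mean, so only W x + b remains.
    mean = [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    # Variance: Var(y_i) = sum_j (W_ij * x_j * s_j)^2 for independent eps_j.
    var = [sum((w * xj * sj) ** 2 for w, xj, sj in zip(row, x, s)) for row in W]
    return mean, var


def sample_with_confidence(x0, W, b, a, n_steps=20):
    """Euler-integrate the mean velocity from t=0 to t=1 and accumulate variance.

    Returns the final sample and the variance integrated along the trajectory
    (an assumed scalar score; lower is read as higher confidence here).
    """
    x = list(x0)
    dt = 1.0 / n_steps
    accumulated = 0.0
    for _ in range(n_steps):
        v_mean, v_var = noisy_linear_moments(x, W, b, a)
        x = [xi + dt * vi for xi, vi in zip(x, v_mean)]
        accumulated += dt * sum(v_var)
    return x, accumulated


if __name__ == "__main__":
    W = [[0.5, -0.2], [0.1, 0.3]]
    b = [0.0, 0.1]
    a = [0.2, -0.1]
    sample, score = sample_with_confidence([1.0, -1.0], W, b, a)
    print(sample, score)
```

The variance comes from one deterministic pass per ODE step, which is the sense in which the score adds no sampling overhead in this toy. Under the same convention, filtering would amount to generating a batch and keeping the samples with the lowest accumulated variance.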