🤖 AI Summary
Existing methods for controlling Rectified Flow models in tasks such as semantic editing and blind image restoration suffer from either geometric locking (inversion-based guidance) or high cost and numerical instability (posterior sampling approximations). This work reformulates restoration as a proximal optimization problem and introduces Score-Guided Proximal Projection (SGPP), a framework that unifies deterministic optimization and stochastic sampling. The objective is shown to induce a normal contraction property, so out-of-distribution inputs are mapped stably onto the data manifold, and the method is shown to reach the posterior mode constrained to that manifold. SGPP generalizes current state-of-the-art editing techniques, with RF-inversion arising as a limiting case, and it allows continuous, training-free adjustment of guidance strength between strict identity preservation and generative freedom.
📝 Abstract
Rectified Flow (RF) models achieve state-of-the-art generation quality, yet controlling them for precise tasks, such as semantic editing or blind image recovery, remains a challenge. Current approaches bifurcate into inversion-based guidance, which suffers from "geometric locking" by rigidly adhering to the source trajectory, and posterior sampling approximations (e.g., DPS), which are computationally expensive and unstable. In this work, we propose Score-Guided Proximal Projection (SGPP), a unified framework that bridges the gap between deterministic optimization and stochastic sampling. We reformulate the recovery task as a proximal optimization problem, defining an energy landscape that balances fidelity to the input with realism from the pre-trained score field. We theoretically prove that this objective induces a normal contraction property, geometrically guaranteeing that out-of-distribution inputs are snapped onto the data manifold, and that it effectively reaches the posterior mode constrained to the manifold. Crucially, we demonstrate that SGPP generalizes state-of-the-art editing methods: RF-inversion is effectively a limiting case of our framework. By relaxing the proximal variance, SGPP enables "soft guidance," offering a continuous, training-free trade-off between strict identity preservation and generative freedom.
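
To make the idea of a score-guided proximal objective concrete, the following toy sketch may help. It is not the SGPP algorithm from the paper: the abstract does not state the objective, so the energy below, E(x) = ||x - y||^2 / (2 * tau2) - log p(x), is an assumed generic form, and the names `SIGMA_DIAG`, `score`, `proximal_projection`, `closed_form`, and `tau2` are all hypothetical. A 2D Gaussian with an analytic score stands in for the pre-trained score field, with one wide axis playing the role of the data manifold and one narrow axis playing the role of the normal direction.

```python
import numpy as np

# Toy stand-in for the data distribution (assumption, not from the paper):
# a zero-mean Gaussian that is wide along axis 0 (the "manifold" direction)
# and narrow along axis 1 (the "normal" direction).
SIGMA_DIAG = np.array([1.0, 0.01])  # per-axis variances


def score(x):
    """Analytic score, grad_x log p(x), of the toy Gaussian."""
    return -x / SIGMA_DIAG


def proximal_projection(y, tau2=1.0, step=0.01, n_steps=2000):
    """Gradient descent on the assumed energy
    E(x) = ||x - y||^2 / (2 * tau2) - log p(x).

    tau2 is a hypothetical "proximal variance": smaller values keep x
    closer to the input y, larger values let the score term dominate.
    """
    x = y.copy()
    for _ in range(n_steps):
        # Fidelity gradient minus the score (gradient of -log p is -score).
        grad = (x - y) / tau2 - score(x)
        x = x - step * grad
    return x


def closed_form(y, tau2=1.0):
    """Exact minimizer of the same energy, available because p is Gaussian."""
    return (y / tau2) / (1.0 / tau2 + 1.0 / SIGMA_DIAG)


y = np.array([2.0, 2.0])          # an off-manifold input
x_star = proximal_projection(y)   # iterative solution
x_ref = closed_form(y)            # analytic solution for comparison
```

In this toy setting with `tau2 = 1`, the closed form scales the manifold coordinate by 1/2 and the normal coordinate by 1/101, so the off-manifold component is contracted far more strongly than the on-manifold one. Raising `tau2` weakens the pull toward `y`, which loosely mirrors the trade-off between identity preservation and generative freedom described in the abstract.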