Identity-preserving Distillation Sampling by Fixed-Point Iterator

2025-02-27
Citations: 0
Influential: 0
AI Summary
In text-conditioned image and 3D generation, Score Distillation Sampling (SDS) often yields blurry edits and identity distortion (e.g., pose or structural misalignment) due to noisy gradient estimates. To address this, we propose an identity-preserving distillation sampling framework centered on Fixed-Point Regularization (FPR), the first method to directly regularize the text-conditioned score function within the SDS paradigm. FPR enables self-calibration of gradient bias without requiring reference image pairs, thereby keeping identity consistent before and after editing. Our approach improves structural fidelity and detail sharpness in both text-driven image editing and editable Neural Radiance Fields (NeRFs), suppressing blur and identity drift. Quantitative and qualitative evaluations show consistent improvements over existing state-of-the-art methods across multiple metrics.

Abstract
Score distillation sampling (SDS) demonstrates a powerful capability for text-conditioned 2D image and 3D object generation by distilling knowledge from learned score functions. However, SDS often suffers from blurriness caused by noisy gradients. When SDS is applied to image editing, such degradations can be reduced by adjusting bias shifts using reference pairs, but these de-biasing techniques are still corrupted by erroneous gradients. To this end, we introduce Identity-preserving Distillation Sampling (IDS), which compensates for the gradient components that lead to undesired changes in the results. Based on the analysis that these errors come from the text-conditioned scores, a new regularization technique, called fixed-point iterative regularization (FPR), is proposed to modify the score itself, driving the preservation of identity, including poses and structures. Thanks to the self-correction provided by FPR, the proposed method yields clear and unambiguous representations corresponding to the given prompts in image-to-image editing and editable neural radiance fields (NeRF). The structural consistency between the source and the edited data is maintained better than with other state-of-the-art methods.
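For readers unfamiliar with SDS, the gradient the abstract refers to is commonly written in the form below. This is the standard formulation from the SDS literature, not an equation taken from this summary, and all symbols are assumed notation: \theta for the parameters being optimized, z for the rendered image or latent, y for the text prompt, \epsilon_\phi for the pretrained noise predictor, w(t) for a timestep weighting, and \alpha_t, \sigma_t for the noise schedule.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\epsilon_\phi(z_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial z}{\partial \theta}
    \right],
\qquad z_t = \alpha_t z + \sigma_t \epsilon,
\quad \epsilon \sim \mathcal{N}(0, I)
```

The residual \epsilon_\phi(z_t; y, t) - \epsilon is the noisy term that the abstract identifies as the source of blur, and the text-conditioned prediction \epsilon_\phi is the quantity that FPR is described as modifying.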
Problem

Research questions and friction points this paper is trying to address.

Reduces blurriness in image generation
Preserves identity in image editing
Enhances structural consistency in edits
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fixed-point iterative regularization
Identity-preserving Distillation Sampling
Self-correction by FPR
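
The summary does not spell out the FPR update rule, so the following is only a toy sketch of what a fixed-point iteration looks like in general, not the authors' algorithm. It assumes, purely for illustration, that the fixed-point condition is "the denoised estimate of a noisy latent reproduces the source". The names `predict_noise`, `posterior_mean`, and `fixed_point_correct`, the scalar latent, and every constant are hypothetical.

```python
def predict_noise(z, bias=0.3):
    """Hypothetical stand-in for a text-conditioned noise predictor (toy, deliberately biased)."""
    return 0.5 * z + bias


def posterior_mean(z, alpha=0.8, sigma=0.6):
    """Estimate the clean sample from the noisy latent z using the predicted noise."""
    return (z - sigma * predict_noise(z)) / alpha


def fixed_point_correct(z, x_src, step=0.5, iters=50, tol=1e-10):
    """Iterate z <- z - step * (posterior_mean(z) - x_src).

    A fixed point of this map is a latent whose denoised estimate equals x_src.
    """
    for _ in range(iters):
        residual = posterior_mean(z) - x_src
        if abs(residual) < tol:
            break
        z = z - step * residual
    return z


x_src = 1.0                    # toy "source image" (a single scalar)
z_init = 0.8 * x_src + 0.6 * 0.2  # noisy latent: alpha * x_src + sigma * eps
z_star = fixed_point_correct(z_init, x_src)

print("residual before:", abs(posterior_mean(z_init) - x_src))
print("residual after: ", abs(posterior_mean(z_star) - x_src))
```

In this toy setting the update is a contraction, so the residual shrinks geometrically and the corrected latent reproduces the source. A real implementation would operate on image latents with a pretrained diffusion model and would need its own step size and stopping rule.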