🤖 AI Summary
This work addresses a longstanding trade-off in image restoration: generative methods synthesize realistic textures but suffer from slow multi-step inference and reduced pixel fidelity, while regression-based approaches are fast and faithful but often lack realistic textures. To reconcile these limitations, the authors propose DiSI, a framework that explicitly decouples the generation and regression components of the stochastic interpolant process, enabling a continuous and controllable transition from pure regression to fully generative restoration. DiSI instantiates the framework with two sampling trajectories and a unified sampler that supports high-quality few-step inference along arbitrary trajectories. Its network is a dual-branch U-Net-style Transformer operating in pixel space, with a dedicated branch that strengthens conditional guidance while keeping throughput high. Experiments show that DiSI achieves competitive results across diverse image restoration tasks with efficient inference, while allowing a single model to control the distortion-perception trade-off at inference time.
📝 Abstract
Recent advances in Image Restoration (IR) have been largely driven by generative methods such as Diffusion Models and Flow Matching, which excel at synthesizing realistic textures but suffer from slow multi-step inference and compromised pixel fidelity. In contrast, classical regression-based IR methods are strong precisely in these aspects, offering single-step efficiency and high pixel-level reconstruction fidelity. To bridge this gap, we propose DiSI, a unified framework that Disentangles the underlying Stochastic Interpolant process into independent generation and regression components. This decoupling makes DiSI versatile, enabling a continuous and controllable transition from a pure regression process to a fully generative one. Technically, we instantiate this framework with two specific sampling trajectories, accompanied by a unified sampler for high-quality, few-step inference on arbitrary trajectories. Furthermore, we design a dual-branch U-Net-style Transformer network in pixel space, using a dedicated branch to enhance conditional guidance while ensuring high throughput. Extensive experiments demonstrate that DiSI efficiently achieves competitive results on various IR tasks, while uniquely offering the inference-time flexibility to control the distortion-perception trade-off within a single model.
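To make the idea of moving between regression and generation concrete, here is a minimal sketch of a generic stochastic interpolant with an adjustable noise term. It is not DiSI's formulation, which the abstract does not spell out. The linear coefficients, the `sqrt(t * (1 - t))` noise schedule, and the names `interpolant` and `lam` are all assumptions chosen for illustration. With `lam = 0` the path between the degraded and clean image is deterministic, which loosely corresponds to the regression end; larger `lam` injects more stochasticity along the path.

```python
import numpy as np


def interpolant(x0, x1, t, lam, rng):
    """Hypothetical interpolant: x_t = (1 - t) * x0 + t * x1 + lam * sqrt(t * (1 - t)) * z.

    x0  : degraded input (path start), x1 : clean target (path end)
    t   : time in [0, 1]
    lam : assumed noise scale; 0 gives a deterministic path
    """
    z = rng.standard_normal(x0.shape)      # Gaussian latent noise
    gamma = lam * np.sqrt(t * (1.0 - t))   # vanishes at t = 0 and t = 1
    return (1.0 - t) * x0 + t * x1 + gamma * z


rng = np.random.default_rng(0)
degraded = rng.random((8, 8))  # stand-in for a degraded image
clean = rng.random((8, 8))     # stand-in for the clean image

mid_regression = interpolant(degraded, clean, 0.5, 0.0, rng)  # deterministic midpoint
mid_generative = interpolant(degraded, clean, 0.5, 1.0, rng)  # noisy midpoint
```

Because the noise coefficient vanishes at both endpoints, every choice of `lam` still starts at the degraded image and ends at the clean one; only the intermediate states differ.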