🤖 AI Summary
Reward-guided diffusion models for protein and DNA design typically generate in a single shot, denoising once from a fully noised state to a final sample, which limits how well downstream biological reward functions can be optimized. To address this, we propose a reward-guided iterative refinement framework that reformulates test-time optimization as an evolutionary process alternating between controlled noise injection and reward-driven denoising. We also provide a theoretical guarantee for this framework. Technically, our approach integrates diffusion modeling with reinforcement learning–inspired reward shaping, adaptive noise scheduling, and gradient-free optimization via score-based gradient approximation. Evaluated on protein design and cell-type-specific regulatory DNA sequence generation, our method improves functional fidelity and biophysical feasibility, outperforming state-of-the-art single-shot sampling and gradient-based optimization methods.
📝 Abstract
To fully leverage the capabilities of diffusion models, we are often interested in optimizing downstream reward functions during inference. Although numerous algorithms for reward-guided generation have recently been proposed, current approaches predominantly focus on single-shot generation, transitioning once from a fully noised state to a denoised state. We propose a novel framework for inference-time reward optimization with diffusion models, inspired by evolutionary algorithms. Our approach employs an iterative refinement process in which each iteration consists of two steps: noising and reward-guided denoising. This sequential refinement allows errors introduced during reward optimization to be corrected gradually. We also provide a theoretical guarantee for our framework. Finally, we demonstrate its superior empirical performance in protein design and cell-type-specific regulatory DNA design. The code is available at https://github.com/masa-ue/ProDifEvo-Refinement.
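The two-step loop described in the abstract (noising, then reward-guided denoising) can be illustrated with a toy sketch. This is not the authors' implementation: the reward (GC fraction), the uniform "denoiser", the best-of-N selection, the keep-if-not-worse acceptance rule, and every function and parameter name below are hypothetical stand-ins chosen only to make the control flow concrete. A real system would use a pretrained diffusion model and a learned reward.

```python
import random

ALPHABET = "ACGT"


def reward(seq):
    # Hypothetical toy reward: fraction of G/C bases (stand-in for a learned reward).
    return sum(c in "GC" for c in seq) / len(seq)


def noise(seq, rate, rng):
    # Noising step: mask each position independently with probability `rate`.
    return [None if rng.random() < rate else c for c in seq]


def guided_denoise(noisy, n_candidates, rng):
    # Stand-in for reward-guided denoising: fill masked positions uniformly
    # at random (a real method would sample from a diffusion model), then
    # keep the candidate with the highest reward (best-of-N, an assumption).
    candidates = [
        "".join(c if c is not None else rng.choice(ALPHABET) for c in noisy)
        for _ in range(n_candidates)
    ]
    return max(candidates, key=reward)


def refine(seq, n_iters=20, rate=0.3, n_candidates=8, seed=0):
    # Iterative refinement: alternate noising and reward-guided denoising.
    rng = random.Random(seed)
    history = [reward(seq)]
    for _ in range(n_iters):
        proposal = guided_denoise(noise(seq, rate, rng), n_candidates, rng)
        # Assumed acceptance rule: keep the proposal only if it is not worse.
        if reward(proposal) >= reward(seq):
            seq = proposal
        history.append(reward(seq))
    return seq, history


if __name__ == "__main__":
    final_seq, rewards = refine("ATATATATATAT")
    print(final_seq, rewards[0], rewards[-1])
```

Because each iteration only partially re-noises the current sample, a later iteration can revisit and repair positions that an earlier one filled poorly, which is the intuition behind the gradual error correction the abstract mentions.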