🤖 AI Summary
This work addresses poor text–image alignment and limited generation quality during sampling in frozen text-to-image diffusion models (e.g., Stable Diffusion, Flux). We propose an instance-level sampling-schedule method that requires no fine-tuning of model weights. Our approach learns a dynamic schedule conditioned on both the text prompt and the noise level, modeled by a Dirichlet policy network that runs in a single forward pass. To stabilize policy optimization, we adopt the James–Stein estimator as the baseline in the REINFORCE algorithm, which significantly reduces the variance of high-dimensional policy-gradient estimates and enables model-agnostic, post-training optimization. Experiments show that our method raises the generation quality of a five-step Flux-Dev sampler to near that of Flux-Schnell, while markedly improving text alignment, character-rendering fidelity, and control over complex compositional structures.
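The prompt- and noise-conditioned schedule described above can be sketched as follows. This is a minimal illustration, not the paper's architecture: the linear head, the softplus parameterization of the concentrations, and all dimensions are assumptions. The key idea is that one forward pass yields Dirichlet concentration parameters, a single Dirichlet draw gives step-size fractions that sum to one, and their cumulative sum is an increasing per-instance timestep grid.

```python
import numpy as np

rng = np.random.default_rng(0)

def dirichlet_schedule(prompt_emb, W, b, t_max=1.0):
    """Hypothetical single-forward Dirichlet policy (illustrative sketch):
    a linear head maps a prompt embedding to concentration parameters,
    one per sampling interval."""
    logits = prompt_emb @ W + b
    # Softplus keeps the Dirichlet concentrations strictly positive.
    alpha = np.log1p(np.exp(logits)) + 1e-3
    # One Dirichlet draw = fractions of the timeline, summing to 1.
    fractions = rng.dirichlet(alpha)
    # Cumulative sums turn the fractions into an increasing timestep grid.
    return t_max * np.cumsum(fractions)

# Toy usage: a 16-d prompt embedding mapped to a 5-step schedule.
d, k = 16, 5
W, b = rng.standard_normal((d, k)) * 0.1, np.zeros(k)
emb = rng.standard_normal(d)
ts = dirichlet_schedule(emb, W, b)
```

Because the fractions are almost surely positive and sum to one, the resulting grid is strictly increasing and ends exactly at `t_max`, so it can drop directly into a frozen sampler's timestep loop.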
📝 Abstract
Most post-training methods for text-to-image samplers focus on model weights: either fine-tuning the backbone for alignment or distilling it for few-step efficiency. We take a different route: rescheduling the sampling timeline of a frozen sampler. Instead of a fixed, global schedule, we learn instance-level (prompt- and noise-conditioned) schedules through a single-pass Dirichlet policy. To ensure accurate gradient estimates in high-dimensional policy learning, we introduce a novel reward baseline based on a principled James–Stein estimator; it provably achieves lower estimation error than commonly used variants and leads to superior performance. Our rescheduled samplers consistently improve text–image alignment, including text rendering and compositional control, across modern Stable Diffusion and Flux model families. Additionally, a 5-step Flux-Dev sampler with our schedules can attain generation quality comparable to that of deliberately distilled samplers like Flux-Schnell. We thus position our scheduling framework as an emerging model-agnostic post-training lever that unlocks additional generative potential in pretrained samplers.
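To make the baseline idea concrete, the sketch below applies positive-part James–Stein shrinkage to a batch of per-instance rewards and uses the result as the REINFORCE baseline. The shrinkage-toward-the-grand-mean form, the `sigma2` argument, and the function names are assumptions for exposition; the paper's exact estimator may differ.

```python
import numpy as np

def james_stein_baseline(rewards, sigma2=1.0):
    """Positive-part James-Stein shrinkage of a reward vector toward its
    grand mean, used as a per-instance REINFORCE baseline.
    (Illustrative sketch; not the paper's exact estimator.)"""
    r = np.asarray(rewards, dtype=float)
    d = r.size
    mean = r.mean()
    centered = r - mean
    norm2 = centered @ centered
    if d < 4 or norm2 == 0.0:
        # Too few effective dimensions to shrink: fall back to the mean.
        return np.full_like(r, mean)
    # Positive-part shrinkage factor in [0, 1]; (d - 3) accounts for the
    # degree of freedom spent estimating the grand mean.
    shrink = max(0.0, 1.0 - (d - 3) * sigma2 / norm2)
    return mean + shrink * centered

def reinforce_grad(logp_grads, rewards):
    """REINFORCE update with the shrunken baseline: average of
    (reward - baseline) * grad log-prob over the batch."""
    adv = np.asarray(rewards, dtype=float) - james_stein_baseline(rewards)
    return np.mean(adv[:, None] * np.asarray(logp_grads, dtype=float), axis=0)

# Toy usage: 4 samples, 3-d policy parameters, identical rewards give
# zero advantage and hence a zero gradient.
grads = np.ones((4, 3))
g = reinforce_grad(grads, [2.0, 2.0, 2.0, 2.0])
```

Shrinking each per-instance baseline toward the batch mean trades a small bias for a large variance reduction, which is exactly what high-dimensional policy-gradient estimation needs.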