Transition Models: Rethinking the Generative Learning Objective

2025-09-04
Citations: 0
Influential: 0
AI Summary
Generative modeling faces a fundamental trade-off between sample quality and computational efficiency: iterative diffusion models achieve high fidelity at significant computational cost, whereas few-step models are constrained by a quality ceiling. This paper introduces Transition Models (TiM), a framework built on an exact continuous-time dynamics equation that analytically defines state transitions across any finite time interval. TiM unifies single-step generation and multi-step refinement, and unlike previous few-step generators it shows monotonic quality improvement as the number of sampling steps increases. With only 865M parameters, TiM achieves state-of-the-art performance across all evaluated step counts, surpassing SD3.5 (8B parameters) and FLUX.1 (12B parameters); with its native-resolution strategy, it delivers exceptional fidelity at resolutions up to 4096×4096.

Abstract
A fundamental dilemma in generative modeling persists: iterative diffusion models achieve outstanding fidelity, but at a significant computational cost, while efficient few-step alternatives are constrained by a hard quality ceiling. This conflict between generation steps and output quality arises from restrictive training objectives that focus exclusively on either infinitesimal dynamics (PF-ODEs) or direct endpoint prediction. We address this challenge by introducing an exact, continuous-time dynamics equation that analytically defines state transitions across any finite time interval. This leads to a novel generative paradigm, Transition Models (TiM), which adapt to arbitrary-step transitions, seamlessly traversing the generative trajectory from single leaps to fine-grained refinement with more steps. Despite having only 865M parameters, TiM achieves state-of-the-art performance, surpassing leading models such as SD3.5 (8B parameters) and FLUX.1 (12B parameters) across all evaluated step counts. Importantly, unlike previous few-step generators, TiM demonstrates monotonic quality improvement as the sampling budget increases. Additionally, when employing our native-resolution strategy, TiM delivers exceptional fidelity at resolutions up to 4096x4096.
Problem

Research questions and friction points this paper is trying to address.

Resolving the trade-off between diffusion model fidelity and computational cost
Overcoming restrictive training objectives limiting few-step generation quality
Enabling continuous-time transitions for adaptive multi-step generative trajectories
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exact continuous-time dynamics equation for transitions
Adapts to arbitrary-step transitions for generation
Native-resolution strategy for high-fidelity outputs
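
The contrast between infinitesimal dynamics and finite-interval transitions can be illustrated with a toy example. The sketch below is not TiM's equation or training objective, which this page does not state. It assumes a hypothetical linear ODE dx/dt = A·x, chosen only because its exact transition over any interval is known in closed form. The names `A`, `euler_step`, `exact_transition`, and `sample` are illustrative. A first-order step that follows only the local derivative is inaccurate with few steps, while an exact finite-interval transition gives the same endpoint in one step or many.

```python
import math

A = -1.5  # assumption: decay rate of a toy ODE dx/dt = A * x (not from the paper)


def euler_step(x, t, r):
    """Infinitesimal-dynamics view: one first-order step from time t to r."""
    return x + (r - t) * A * x


def exact_transition(x, t, r):
    """Finite-interval view: closed-form state transition from time t to r."""
    return x * math.exp(A * (r - t))


def sample(step_fn, x0, num_steps, t_start=0.0, t_end=1.0):
    """Apply step_fn over num_steps equal intervals from t_start to t_end."""
    x = x0
    span = t_end - t_start
    for i in range(num_steps):
        t = t_start + span * i / num_steps
        r = t_start + span * (i + 1) / num_steps
        x = step_fn(x, t, r)
    return x


if __name__ == "__main__":
    target = exact_transition(1.0, 0.0, 1.0)
    for n in (1, 4, 16):
        euler_err = abs(sample(euler_step, 1.0, n) - target)
        exact_err = abs(sample(exact_transition, 1.0, n) - target)
        print(f"steps={n:2d}  euler_err={euler_err:.4f}  exact_err={exact_err:.2e}")
```

In this toy setting the Euler error shrinks only as the step count grows, whereas the exact transition is accurate at every step count. In a learned model the transition would be approximated by a network rather than given in closed form, so the analogy is loose.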