Normalizing Trajectory Models

2026-05-08
Citations: 0
Influential: 0

AI Summary
This work addresses the challenge in few-step diffusion generation where the Gaussian denoising assumption breaks down and existing methods struggle to simultaneously achieve high sample quality and accurate likelihood estimation. The authors propose a novel framework based on conditional normalizing flows, employing shallow invertible blocks and a cross-trajectory deep parallel predictor to model the full reverse trajectory while preserving exact likelihood during training. This approach enables, for the first time, full-trajectory likelihood optimization under few-step generation. A self-distillation mechanism is introduced, leveraging the model's own score signals to train a lightweight denoiser. The method supports training from scratch or initialization from a pretrained flow-matching model, and achieves state-of-the-art or competitive performance on text-to-image synthesis with only four sampling steps, maintaining both high visual quality and computable likelihood.
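One way to read "full-trajectory likelihood" is as a sum of per-step conditional log-densities, each computed by the change-of-variables formula. The notation below is an assumption made for illustration ($x_T$ is the noise end of the trajectory, $x_0$ the data end, $f_\theta$ an invertible map conditioned on the previous state and step index). The source does not state the paper's own parameterization, which may differ.

```latex
\begin{align}
\log p_\theta(x_{0:T}) &= \log p(x_T) + \sum_{t=1}^{T} \log p_\theta(x_{t-1} \mid x_t), \\
\log p_\theta(x_{t-1} \mid x_t) &= \log \mathcal{N}(z_t;\, 0, I)
  + \log \left| \det \frac{\partial f_\theta(x_{t-1};\, x_t, t)}{\partial x_{t-1}} \right|,
  \qquad z_t = f_\theta(x_{t-1};\, x_t, t).
\end{align}
```

Under this reading, four-step generation corresponds to $T = 4$, and every term stays exactly computable as long as each $f_\theta$ is invertible with a tractable Jacobian determinant.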
πŸ“ Abstract
Diffusion-based models decompose sampling into many small Gaussian denoising steps -- an assumption that breaks down when generation is compressed to a few coarse transitions. Existing few-step methods address this through distillation, consistency training, or adversarial objectives, but sacrifice the likelihood framework in the process. We introduce Normalizing Trajectory Models (NTM), which models each reverse step as an expressive conditional normalizing flow with exact likelihood training. Architecturally, NTM combines shallow invertible blocks within each step with a deep parallel predictor across the trajectory, forming an end-to-end network trainable from scratch or initializable from pretrained flow-matching models. Its exact trajectory likelihood further enables self-distillation: a lightweight denoiser trained on the model's own score produces high-quality samples in four steps. On text-to-image benchmarks, NTM matches or outperforms strong image generation baselines in just four sampling steps while uniquely retaining exact likelihood over the generative trajectory.
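The idea of treating each reverse step as a conditional normalizing flow with an exact likelihood can be sketched as a toy example. This is not the authors' architecture. It assumes a single conditional affine coupling layer per step, with small random linear conditioners standing in for a deep network, and every name in it (`ConditionalCoupling`, `trajectory_log_prob`, `sample_trajectory`) is hypothetical.

```python
import numpy as np


def standard_normal_logpdf(z):
    """Log density of N(0, I) at z, summed over the last axis."""
    d = z.shape[-1]
    return -0.5 * np.sum(z ** 2, axis=-1) - 0.5 * d * np.log(2.0 * np.pi)


class ConditionalCoupling:
    """Toy invertible step for p(x_prev | x_t): one affine coupling layer.

    Assumption for illustration only: the conditioner is a tiny random
    linear map, not a trained network.
    """

    def __init__(self, dim, rng):
        assert dim % 2 == 0, "dim must be even so it can be split in half"
        self.half = dim // 2
        # The conditioner sees the untouched half of x_prev plus all of x_t.
        self.W_shift = 0.1 * rng.standard_normal((self.half + dim, self.half))
        self.W_scale = 0.1 * rng.standard_normal((self.half + dim, self.half))

    def _params(self, a, x_t):
        h = np.tanh(np.concatenate([a, x_t], axis=-1))
        shift = h @ self.W_shift
        log_scale = np.tanh(h @ self.W_scale)  # bounded for stability
        return shift, log_scale

    def forward(self, x_prev, x_t):
        """Map x_prev to latent z; also return log|det dz/dx_prev|."""
        a, b = x_prev[..., :self.half], x_prev[..., self.half:]
        shift, log_scale = self._params(a, x_t)
        z = np.concatenate([a, (b - shift) * np.exp(-log_scale)], axis=-1)
        return z, -np.sum(log_scale, axis=-1)

    def inverse(self, z, x_t):
        """Map latent z back to x_prev (used for sampling)."""
        a, zb = z[..., :self.half], z[..., self.half:]
        shift, log_scale = self._params(a, x_t)
        return np.concatenate([a, zb * np.exp(log_scale) + shift], axis=-1)

    def log_prob(self, x_prev, x_t):
        """Exact conditional log-density via change of variables."""
        z, log_det = self.forward(x_prev, x_t)
        return standard_normal_logpdf(z) + log_det


def trajectory_log_prob(steps, trajectory):
    """trajectory = [x_T, ..., x_0]; returns log p(x_T) + sum of step terms."""
    total = standard_normal_logpdf(trajectory[0])
    for step, x_t, x_prev in zip(steps, trajectory[:-1], trajectory[1:]):
        total = total + step.log_prob(x_prev, x_t)
    return total


def sample_trajectory(steps, batch, dim, rng):
    """Draw x_T from N(0, I), then apply each step's inverse map in turn."""
    x = rng.standard_normal((batch, dim))
    trajectory = [x]
    for step in steps:
        x = step.inverse(rng.standard_normal((batch, dim)), x)
        trajectory.append(x)
    return trajectory
```

With four `ConditionalCoupling` steps, `sample_trajectory` yields five states and `trajectory_log_prob` returns one exact log-likelihood per batch element. A purely affine step with no coupling would reduce to a Gaussian conditional, which is the assumption the abstract says breaks down, so the coupling layer is the minimal choice here that makes the conditional non-Gaussian.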
Problem

Research questions and friction points this paper is trying to address.

diffusion models
few-step generation
likelihood framework
trajectory modeling
normalizing flows
Innovation

Methods, ideas, or system contributions that make the work stand out.

Normalizing Flow
Diffusion Models
Few-step Generation
Exact Likelihood
Self-distillation