StreamFlow: Theory, Algorithm, and Implementation for High-Efficiency Rectified Flow Generation

📅 2025-11-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Rectified Flow (RF) models exhibit fundamental theoretical and architectural differences from conventional diffusion models, rendering existing acceleration techniques incompatible. To address this, we propose the first end-to-end efficient acceleration framework specifically designed for RF. Our method introduces three core innovations: (1) batched velocity field modeling, which decouples temporal dependencies to enable parallel computation; (2) heterogeneous timestep vectorized scheduling, optimizing hardware utilization; and (3) dynamic TensorRT compilation, achieving operator-level optimization and memory-access co-design. By tightly integrating flow-matching theory with system-level optimizations, our framework achieves up to 611% speedup on 512×512 image generation—significantly surpassing the current average acceleration of 18% across general-purpose methods—and enables, for the first time, efficient high-resolution deployment of RF models.

Technology Category

Application Category

📝 Abstract
New technologies such as Rectified Flow and Flow Matching have significantly improved the performance of generative models in the past two years, especially in terms of control accuracy, generation quality, and generation efficiency. However, due to some differences in its theory, design, and existing diffusion models, the existing acceleration methods cannot be directly applied to the Rectified Flow model. In this article, we have comprehensively implemented an overall acceleration pipeline from the aspects of theory, design, and reasoning strategies. This pipeline uses new methods such as batch processing with a new velocity field, vectorization of heterogeneous time-step batch processing, and dynamic TensorRT compilation for the new methods to comprehensively accelerate related models based on flow models. Currently, the existing public methods usually achieve an acceleration of 18%, while experiments have proved that our new method can accelerate the 512*512 image generation speed to up to 611%, which is far beyond the current non-generalized acceleration methods.
Problem

Research questions and friction points this paper is trying to address.

Accelerates Rectified Flow models for faster image generation
Overcomes limitations of existing diffusion model acceleration methods
Enhances efficiency in control, quality, and speed of generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Batch processing with new velocity field
Vectorization of heterogeneous time-step batch processing
Dynamic TensorRT compilation for flow models
🔎 Similar Papers
No similar papers found.