🤖 AI Summary
This work addresses the inference latency and system backlog that arise in sequential prediction under streaming observations when generation repeatedly starts from an uninformative initial distribution. To overcome this, the authors propose a recursive generative framework that integrates Bayesian filtering with flow matching. The approach models sequence generation as a probabilistic flow that transports samples across time steps, using the posterior distribution from the previous time step as a warm-start initialization for the next. This enables, for the first time, posterior-guided few-step sampling that matches the performance of full diffusion models with only one to a few sampling steps. The method significantly improves inference efficiency and removes the computational bottleneck inherent in conventional multi-step diffusion, as demonstrated across diverse tasks including prediction, decision-making, and state estimation.
📝 Abstract
Sequential prediction from streaming observations is a fundamental problem in stochastic dynamical systems, where inherent uncertainty often leads to multiple plausible futures. While diffusion and flow-matching models are capable of modeling complex, multi-modal trajectories, their deployment in real-time streaming environments typically relies on repeated sampling from a non-informative initial distribution, incurring substantial inference latency and potential system backlogs. In this work, we introduce Sequential Flow Matching, a principled framework grounded in Bayesian filtering. By treating streaming inference as learning a probability flow that transports the predictive distribution from one time step to the next, our approach naturally aligns with the recursive structure of Bayesian belief updates. We provide theoretical justification that initializing generation from the previous posterior offers a principled warm start that can accelerate sampling compared to naïve re-sampling. Across a wide range of forecasting, decision-making, and state estimation tasks, our method achieves performance competitive with full-step diffusion while requiring only one or very few sampling steps, and therefore faster sampling. This suggests that framing sequential inference via Bayesian filtering provides a new and principled perspective on efficient real-time deployment of flow-based models.
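The warm-start idea can be illustrated with a minimal 1-D toy, which is a sketch of the concept rather than the paper's implementation: here a closed-form velocity field between Gaussians stands in for a learned flow-matching model, and all function names are illustrative. Because consecutive posteriors in a stream are close, transporting the previous posterior's samples needs only a single Euler step, instead of rebuilding each posterior from fresh noise.

```python
import numpy as np

def velocity(x, t, mu0, s0, mu1, s1):
    # Velocity of the straight-line flow N(mu0, s0^2) -> N(mu1, s1^2):
    # each particle keeps its quantile z, so v = (mu1 - mu0) + (s1 - s0) * z.
    # A real model would learn this field via flow matching.
    s_t = (1 - t) * s0 + t * s1
    mu_t = (1 - t) * mu0 + t * mu1
    z = (x - mu_t) / s_t
    return (mu1 - mu0) + (s1 - s0) * z

def transport(x, n_steps, mu0, s0, mu1, s1):
    # Few-step Euler integration of the flow from the source to the target.
    x = np.array(x, dtype=float)
    for k in range(n_steps):
        t = k / n_steps
        x = x + velocity(x, t, mu0, s0, mu1, s1) / n_steps
    return x

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=10_000)      # uninformative init, step 0 only
posterior_means = [1.0, 1.2, 1.4]          # slowly drifting stream posteriors

# Cold start would re-sample noise at every stream step; the warm start
# instead carries the previous posterior samples forward with one Euler step.
mu_prev, s_prev = 0.0, 1.0
for mu in posterior_means:
    x = transport(x, n_steps=1, mu0=mu_prev, s0=s_prev, mu1=mu, s1=0.5)
    mu_prev, s_prev = mu, 0.5

print(round(x.mean(), 1), round(x.std(), 1))   # ≈ 1.4 0.5
```

For these linear-interpolation paths the velocity is constant along each trajectory, so even a single Euler step is exact; the toy thereby mirrors the paper's claim that a posterior warm start makes one-to-few-step sampling sufficient when consecutive predictive distributions are close.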