FlowS: One-Step Motion Prediction via Local Transport Conditioning

📅 2026-04-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of achieving high-accuracy, multimodal, and low-latency motion prediction for autonomous driving. While diffusion models offer strong performance, their multi-step denoising process impedes real-time inference. To overcome this limitation, the authors propose a single-step prediction framework grounded in local transport conditioning, which reframes long-horizon trajectory generation as localized refinement around multiple plausible futures. By combining learned, scene-conditioned prior anchors with a displacement field that enforces semigroup self-consistency, and training with conditional flow matching on a stable, low-variance target, the method approximates the accuracy of multi-step approaches using only a single Euler integration step. Evaluated on the Waymo Open Motion Dataset, the model reports state-of-the-art Soft mAP of 0.4804 and mAP of 0.4703 (with ensemble) at an inference speed of 75 FPS.

📝 Abstract

Generative motion prediction must satisfy three simultaneous requirements for real-world autonomy: high accuracy, diverse multimodal futures, and strictly bounded latency. Diffusion models meet the first two but violate the third, requiring tens to hundreds of denoising steps. We identify a conditioning strategy that resolves this tension: single-step integration is accurate when the underlying transport problem is local. A model that must both discover the correct behavioral mode and traverse a long displacement in one step accumulates large discretization errors; conditioning the base distribution to lie near plausible futures reduces the problem to short-range refinement, the regime where a single Euler step suffices. We instantiate this local transport conditioning in FlowS, a conditional flow matching framework with two mechanisms. First, an online, scene-conditioned learned prior emits K calibrated anchor trajectories per agent, each already near a plausible future, converting mode discovery into local correction. Second, a step-consistent displacement field enforces semigroup self-consistency, guaranteeing that a single step inherits multi-step accuracy. Crucially, anchoring this field at learned priors along straight-line paths yields a stable, low-variance training target, unlike prior self-consistency methods that suffer from high-variance bootstrap signals on curved diffusion paths. On the Waymo Open Motion Dataset, FlowS achieves state-of-the-art Soft mAP (0.4804) and mAP (0.4703) with ensemble at 75 FPS using single-step inference, demonstrating that local transport conditioning makes one-step generative motion prediction practical for safety-critical autonomy. Code and pretrained models will be released upon acceptance.
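
The locality argument in the abstract can be illustrated with a toy 1D experiment. The sketch below is not the FlowS model: it assumes two hypothetical "futures" at -4 and +4, straight-line flow matching paths, and a kernel-regression estimate standing in for a learned velocity network. All names (`one_step_euler`, `mode_error`, `MODES`) are illustrative. With a base distribution that is independent of the target, the regression-optimal velocity at t=0 points toward the average of the modes, so one Euler step lands between them. With a base anchored near each mode, the same single step becomes a short-range correction.

```python
import numpy as np

MODES = np.array([-4.0, 4.0])  # two hypothetical "futures" in 1D (assumption)


def one_step_euler(query, x0, x1, bandwidth=0.3):
    """Single Euler step x0 + v(x0, t=0).

    For straight-line paths the flow matching target is x1 - x0, so the
    regression-optimal velocity at t=0 is E[x1 - x0 | x0]. Here it is
    estimated with Gaussian kernel regression instead of a neural network.
    """
    diff = query[:, None] - x0[None, :]
    weights = np.exp(-0.5 * (diff / bandwidth) ** 2)
    velocity = (weights * (x1 - x0)[None, :]).sum(axis=1) / weights.sum(axis=1)
    return query + velocity


def mode_error(pred):
    """Mean distance from each prediction to its nearest mode."""
    return np.abs(pred[:, None] - MODES[None, :]).min(axis=1).mean()


rng = np.random.default_rng(0)
n = 4000
x1 = rng.choice(MODES, size=n)            # target samples (one mode each)
x0_far = rng.normal(0.0, 1.0, size=n)     # base independent of the target
x0_local = x1 + 0.5 * rng.normal(size=n)  # base anchored near the target mode

# Query at a subset of the training points and take one Euler step.
err_far = mode_error(one_step_euler(x0_far[:200], x0_far, x1))
err_local = mode_error(one_step_euler(x0_local[:200], x0_local, x1))
print(f"far base: {err_far:.2f}, anchored base: {err_local:.2f}")
```

In this toy setting the anchored base plays the role the abstract assigns to the learned prior anchors: mode selection has already happened before integration, so the single step only has to refine. The semigroup self-consistency term and the scene conditioning are not modeled here.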

Problem

Research questions and friction points this paper is trying to address.

motion prediction

one-step generation

low latency

multimodal futures

autonomous driving

Innovation

Methods, ideas, or system contributions that make the work stand out.

local transport conditioning

one-step motion prediction

conditional flow matching