Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling

📅 2025-10-13
🤖 AI Summary
Problem: Imitation learning for autonomous driving is hindered by the difficulty of modeling complex interactive behaviors and by the scarcity of high-quality interaction data. Method: We propose Flow Planner, a planning framework comprising (i) a trajectory segmentation tokenization scheme for fine-grained motion representation; (ii) a spatiotemporal fusion Transformer architecture that explicitly models multi-agent dynamic interactions; and (iii) a flow matching generative framework with classifier-free guidance that enables efficient, stable, and multimodal interaction-aware planning. Flow Planner dynamically modulates inter-agent interaction weights to improve planning consistency and scene adaptability. Results: Flow Planner achieves state-of-the-art performance among learning-based methods on nuPlan and on the high-interaction-density interPlan benchmark, with significant improvements in high-value interactive scenarios, particularly lane-change games and unmarked intersection navigation.

📝 Abstract
Modeling interactive driving behaviors in complex scenarios remains a fundamental challenge for autonomous driving planning. Learning-based approaches attempt to address this challenge with advanced generative models, removing the dependency on over-engineered architectures for representation fusion. However, simply stacking Transformer blocks provides no dedicated mechanism for modeling the interactive behaviors that are common in real driving scenarios. The scarcity of interactive driving data further exacerbates this problem, leaving conventional imitation learning methods ill-equipped to capture high-value interactive behaviors. We propose Flow Planner, which tackles these problems through coordinated innovations in data modeling, model architecture, and learning scheme. Specifically, we first introduce fine-grained trajectory tokenization, which decomposes the trajectory into overlapping segments to reduce the complexity of modeling the whole trajectory. A carefully designed architecture then fuses planning and scene information efficiently across time and space to better capture interactive behaviors. In addition, the framework incorporates flow matching with classifier-free guidance for multi-modal behavior generation, which dynamically reweights agent interactions during inference to maintain coherent response strategies, providing a critical boost for interactive scenario understanding. Experimental results on the large-scale nuPlan dataset and the challenging, interaction-heavy interPlan dataset demonstrate that Flow Planner achieves state-of-the-art performance among learning-based approaches while effectively modeling interactive behaviors in complex driving scenarios.
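The overlapping-segment tokenization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the segment length, the stride, or how overlaps are merged, so `seg_len`, `stride`, the function names, and the averaging of shared waypoints are all assumptions made for illustration rather than the authors' implementation.

```python
import numpy as np


def tokenize_trajectory(traj, seg_len, stride):
    """Split a (T, D) trajectory into overlapping (seg_len, D) segments.

    seg_len and stride are hypothetical parameters; the source does not
    state the values used by Flow Planner.
    """
    traj = np.asarray(traj, dtype=float)
    if not 0 < stride <= seg_len:
        raise ValueError("stride must be in (0, seg_len]")
    num_steps = traj.shape[0]
    if num_steps < seg_len or (num_steps - seg_len) % stride != 0:
        raise ValueError("trajectory length must fit a whole number of segments")
    starts = range(0, num_steps - seg_len + 1, stride)
    return np.stack([traj[s:s + seg_len] for s in starts])


def detokenize_segments(segments, stride):
    """Rebuild a trajectory by averaging waypoints shared between segments.

    Averaging is an assumed merge rule, chosen only because it is simple.
    """
    segments = np.asarray(segments, dtype=float)
    num_seg, seg_len, dim = segments.shape
    num_steps = (num_seg - 1) * stride + seg_len
    total = np.zeros((num_steps, dim))
    count = np.zeros((num_steps, 1))
    for i, seg in enumerate(segments):
        start = i * stride
        total[start:start + seg_len] += seg
        count[start:start + seg_len] += 1
    return total / count


# Usage: 16 waypoints of (x, y), segments of 8 waypoints with 50% overlap.
example_traj = np.arange(32, dtype=float).reshape(16, 2)
example_tokens = tokenize_trajectory(example_traj, seg_len=8, stride=4)
example_rebuilt = detokenize_segments(example_tokens, stride=4)
```

Each token here covers a short window of the plan, so a model only has to represent local motion per token, while the overlap ties neighboring tokens together.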

Problem

Research questions and friction points this paper is trying to address.

Modeling interactive driving behaviors in complex scenarios
Addressing scarcity of interactive driving data for learning
Improving multi-modal behavior generation with dynamic interactions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow matching with classifier-free guidance for generation
Fine-grained trajectory tokenization to reduce complexity
Efficient spatiotemporal fusion architecture for interaction modeling
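
As a rough illustration of the first item, the sketch below shows generic flow matching sampling with classifier-free guidance: a velocity field is integrated from noise (t = 0) to a sample (t = 1) with Euler steps, and the conditional and unconditional velocities are blended by a guidance scale. The names `velocity_fn`, `guidance_scale`, and `toy_velocity` are hypothetical, and the conditioning input is only a stand-in; which inputs Flow Planner drops or reweights for guidance is not specified in this summary.

```python
import numpy as np


def guided_velocity(velocity_fn, x, t, cond, guidance_scale):
    """Blend unconditional and conditional velocities (classifier-free guidance).

    Assumed convention: scale 0 is unconditional, 1 is purely conditional,
    and values above 1 extrapolate past the conditional prediction.
    """
    v_uncond = velocity_fn(x, t, None)
    v_cond = velocity_fn(x, t, cond)
    return v_uncond + guidance_scale * (v_cond - v_uncond)


def sample(velocity_fn, x0, cond, guidance_scale=1.0, num_steps=10):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * guided_velocity(velocity_fn, x, t, cond, guidance_scale)
    return x


def toy_velocity(x, t, cond):
    """Hypothetical stand-in for a trained network.

    Uses the straight-line velocity (target - x) / (1 - t), where the
    target is cond, or the origin when the condition is dropped.
    """
    target = np.zeros_like(x) if cond is None else np.asarray(cond, dtype=float)
    return (target - x) / (1.0 - t)


# Usage: start from fixed "noise" and steer toward a conditioned endpoint.
start = np.array([3.0, -2.0])
goal = np.array([1.0, 4.0])
plan_cond = sample(toy_velocity, start, goal, guidance_scale=1.0)
plan_uncond = sample(toy_velocity, start, goal, guidance_scale=0.0)
```

In a real planner the toy field would be replaced by a learned network, and the guidance scale would trade off adherence to the conditioning signal against the diversity of sampled plans.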