KoopmanFlow: Spectrally Decoupled Generative Control Policy via Koopman Structural Bias

📅 2026-03-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge that existing generative control policies struggle to simultaneously maintain stable global motion and perform high-frequency local corrections, because uniform time integration tends to smooth out transient details. To overcome this, the authors propose a spectrally decoupled generative control architecture that incorporates a Koopman structural prior within a unified multimodal latent space. The macroscopic branch models slowly varying trajectories via single-step consistency training, while the transient branch uses flow matching to capture high-frequency residuals induced by visual discontinuities (such as contacts or occlusions). An asymmetric consistency objective enables joint modeling of low- and high-frequency dynamics. This approach avoids error accumulation across multiple stages and significantly outperforms current methods in contact-rich, disturbance-sensitive tasks, achieving both high control accuracy and parameter efficiency under real-time deployment constraints.
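To make the idea of separating slow motion from high-frequency residuals concrete, here is a minimal sketch that splits an action chunk with a plain FFT low-pass filter. It is an illustration only: the summary does not say how KoopmanFlow computes its spectral prior, and the function name `spectral_split` and the parameter `cutoff_bins` are hypothetical.

```python
import numpy as np

def spectral_split(actions, cutoff_bins):
    """Split an action chunk of shape (T, D) into a slow part and a residual.

    cutoff_bins (hypothetical hyperparameter) is the number of lowest
    real-FFT frequency bins kept in the slow component.
    """
    actions = np.asarray(actions, dtype=float)
    num_steps = actions.shape[0]
    spectrum = np.fft.rfft(actions, axis=0)
    low_spectrum = spectrum.copy()
    low_spectrum[cutoff_bins:] = 0.0          # zero out the high-frequency bins
    slow = np.fft.irfft(low_spectrum, n=num_steps, axis=0)
    residual = actions - slow                 # everything the low-pass removed
    return slow, residual

# Toy chunk: one slow sine period plus a single sharp "contact" spike.
T = 64
t = np.linspace(0.0, 1.0, T, endpoint=False)
chunk = np.sin(2.0 * np.pi * t)[:, None]
chunk[40, 0] += 1.0

slow, residual = spectral_split(chunk, cutoff_bins=4)
spike_index = int(np.argmax(np.abs(residual[:, 0])))
```

In this toy case the sine lands entirely in `slow`, while the spike dominates `residual`, which is the kind of signal a transient branch would be asked to model.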

📝 Abstract

Generative Control Policies (GCPs) show immense promise in robotic manipulation but struggle to simultaneously model stable global motions and high-frequency local corrections. While modern architectures extract multi-scale spatial features, their underlying Probability Flow ODEs apply a uniform temporal integration schedule. When compressed to a single step for real-time Receding Horizon Control (RHC), uniform ODE solvers mathematically smooth over sparse, high-frequency transients entangled within low-frequency steady states. To decouple these dynamics without accumulating pipelined errors, we introduce KoopmanFlow, a parameter-efficient generative policy guided by a Koopman-inspired structural inductive bias. Operating in a unified multimodal latent space with visual context, KoopmanFlow bifurcates generation at the terminal stage. Because visual conditioning occurs before spectral decomposition, both branches are visually guided yet temporally specialized. A macroscopic branch anchors slow-varying trajectories via single-step Consistency Training, while a transient branch uses Flow Matching to isolate high-frequency residuals triggered by sudden visual cues (e.g., contacts or occlusions). Guided by an explicit spectral prior and optimized via a novel asymmetric consistency objective, KoopmanFlow establishes a fused co-training mechanism. This allows the transient branch to absorb localized dynamics without multi-stage error accumulation. Extensive experiments show KoopmanFlow significantly outperforms state-of-the-art baselines in contact-rich tasks requiring agile disturbance rejection. By trading a surplus latency buffer for a richer structural prior, KoopmanFlow achieves superior control fidelity and parameter efficiency within real-time deployment limits.
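For readers unfamiliar with Flow Matching, the sketch below shows its commonly used linear-path form together with a fixed-step Euler sampler. It is a generic illustration, not the paper's objective: the abstract does not specify the probability path, the loss weighting, or the asymmetric consistency term, and the names `flow_matching_pair`, `flow_matching_loss`, and `euler_sample` are hypothetical.

```python
import numpy as np

def flow_matching_pair(x0, x1, t):
    """Return a point on the straight path from x0 to x1 and its target velocity.

    x0: noise samples, x1: data samples (same shape), t: times in [0, 1], shape (batch,).
    """
    t = np.asarray(t, dtype=float).reshape(-1, *([1] * (x0.ndim - 1)))
    x_t = (1.0 - t) * x0 + t * x1      # linear interpolation between noise and data
    v_target = x1 - x0                 # constant velocity along the straight path
    return x_t, v_target

def flow_matching_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocities."""
    return float(np.mean((v_pred - v_target) ** 2))

def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with uniform Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x

rng = np.random.default_rng(0)
noise = rng.standard_normal((8, 4))
data = rng.standard_normal((8, 4))

x_mid, v = flow_matching_pair(noise, data, np.full(8, 0.5))
# With the exact straight-path velocity, a single Euler step already lands on the data.
one_step = euler_sample(lambda x, t: data - noise, noise, num_steps=1)
```

A single step is exact here only because the oracle velocity is constant in time. A learned velocity field that changes along the path is integrated less faithfully as the step count drops, which is the regime the abstract is concerned with.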

Problem

Research questions and friction points this paper is trying to address.

Generative Control Policy

High-frequency Transients

Temporal Integration

Real-time Control

Multiscale Dynamics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Koopman Operator

Spectral Decoupling

Generative Control Policy