🤖 AI Summary
This work addresses the ill-posedness of optimal control problems arising from traditional KL regularization, which becomes infinite under support mismatch and degenerates in the low-noise limit. To resolve this, the authors propose KL-type divergences grounded in an information-geometric framework, replacing the Fisher–Rao metric with Wasserstein and Kalman–Wasserstein transport metrics. The resulting divergences remain finite in the low-noise regime, thereby eliminating singularities and offering a geometric justification for heuristic regularization terms commonly used in Kalman ensemble methods. Using closed-form derivations for Gaussian distributions and an analysis of linear time-invariant systems, the approach is validated on double-integrator and cart-pole examples, demonstrating both well-posedness and improved control performance over classical KL regularization.
📝 Abstract
Kullback–Leibler (KL) divergence regularization is widely used in reinforcement learning, but it becomes infinite under support mismatch and can degenerate in the low-noise limit. Working within a unified information-geometric framework, we introduce (Kalman–)Wasserstein-based KL analogues by replacing the Fisher–Rao geometry in the dynamical formulation of the KL with transport-based geometries, and we derive closed-form values for common distribution families. These divergences remain finite under support mismatch and yield a geometric interpretation of regularization heuristics used in Kalman ensemble methods. We demonstrate their utility in KL-regularized optimal control. In the fully tractable setting of linear time-invariant systems with Gaussian process noise, the classical KL reduces to a quadratic control penalty that becomes singular as the process noise vanishes. Our variants remove this singularity, yielding well-posed problems. On a double integrator and a cart-pole example, the resulting controls outperform KL-based regularization.
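The low-noise pathology the abstract describes can be sketched numerically with the standard closed forms for one-dimensional Gaussians (this uses the textbook KL and 2-Wasserstein formulas, not the paper's Kalman–Wasserstein construction): as the second Gaussian's standard deviation shrinks toward zero, the KL divergence blows up while the Wasserstein distance stays bounded.

```python
import math

def kl_gauss(m0, s0, m1, s1):
    # Closed-form KL( N(m0, s0^2) || N(m1, s1^2) ); diverges as s1 -> 0.
    return math.log(s1 / s0) + (s0**2 + (m0 - m1)**2) / (2 * s1**2) - 0.5

def w2_gauss(m0, s0, m1, s1):
    # Closed-form 2-Wasserstein distance between 1-D Gaussians;
    # remains finite even in the degenerate limit s1 -> 0.
    return math.sqrt((m0 - m1)**2 + (s0 - s1)**2)

# Shrink the "process noise" of the target distribution.
for s1 in (1.0, 1e-2, 1e-4):
    kl = kl_gauss(0.0, 1.0, 1.0, s1)
    w2 = w2_gauss(0.0, 1.0, 1.0, s1)
    print(f"s1={s1:g}  KL={kl:.3e}  W2={w2:.3f}")
```

The printed table shows KL growing without bound (roughly as 1/s1²) while W2 converges to a finite value, which is the qualitative behavior motivating the transport-based divergences.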