Statistical and Geometrical properties of regularized Kernel Kullback-Leibler divergence

📅 2024-08-29

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

The kernel Kullback-Leibler (KKL) divergence of Bach [2022] is not defined when the two distributions being compared have disjoint supports. To address this, the paper proposes a regularized KKL divergence built from covariance operator embeddings in a reproducing kernel Hilbert space (RKHS). The regularized divergence is well defined for all probability distributions, including those with disjoint supports, and comes with bounds on its deviation from the original KKL, finite-sample bounds, and a closed-form expression for distributions supported on finite sets of points. The authors also derive a Wasserstein gradient descent scheme for discrete distributions and study empirically how it transports a set of points to a target distribution.

📝 Abstract

In this paper, we study the statistical and geometrical properties of the Kullback-Leibler divergence with kernel covariance operators (KKL) introduced by Bach [2022]. Unlike the classical Kullback-Leibler (KL) divergence, which involves density ratios, the KKL compares probability distributions through their covariance operators (embeddings) in a reproducing kernel Hilbert space (RKHS) and computes the quantum Kullback-Leibler divergence between them. This novel divergence hence shares parallel but different aspects with both the standard Kullback-Leibler divergence between probability distributions and kernel embedding metrics such as the maximum mean discrepancy. A limitation of the original KKL divergence is that it is not defined for distributions with disjoint supports. To solve this problem, we propose a regularised variant that guarantees that the divergence is well defined for all distributions. We derive bounds that quantify the deviation of the regularised KKL from the original one, as well as finite-sample bounds. In addition, we provide a closed-form expression for the regularised KKL when the distributions are supported on finite sets of points, which makes it implementable. Furthermore, we derive a Wasserstein gradient descent scheme for the KKL divergence in the case of discrete distributions, and study empirically how it transports a set of points to a target distribution.
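The disjoint-support problem described in the abstract can be illustrated with a minimal sketch. It is not the paper's closed-form expression, and it rests on several assumptions that do not come from the abstract: an explicit finite-dimensional feature map (one-hot vectors) stands in for the RKHS embedding, the regularisation is assumed to mix the second covariance with the first using a weight `alpha`, and the helper names `covariance`, `quantum_kl`, and `regularized_kkl` are hypothetical.

```python
import numpy as np


def covariance(features):
    """Uncentered empirical covariance (1/n) * sum_i phi_i phi_i^T."""
    features = np.asarray(features, dtype=float)
    return features.T @ features / features.shape[0]


def quantum_kl(A, B, tol=1e-12):
    """Tr(A (log A - log B)) for symmetric PSD matrices A and B.

    Returns inf when A has mass on the null space of B, which is the
    matrix analogue of comparing distributions with disjoint supports.
    """
    a = np.clip(np.linalg.eigvalsh(A), 0.0, None)
    b, V = np.linalg.eigh(B)
    b = np.clip(b, 0.0, None)

    # Tr(A log A) = sum_i a_i log a_i, with the convention 0 log 0 = 0.
    pos = a > tol
    tr_a_log_a = np.sum(a[pos] * np.log(a[pos]))

    # Tr(A log B) = sum_j log(b_j) * (v_j^T A v_j) over eigenpairs of B.
    overlap = np.einsum("ij,ik,kj->j", V, A, V)
    null_b = b <= tol
    if np.any(overlap[null_b] > tol):
        return np.inf
    tr_a_log_b = np.sum(overlap[~null_b] * np.log(b[~null_b]))
    return tr_a_log_a - tr_a_log_b


def regularized_kkl(A, B, alpha):
    """ASSUMED regularisation: replace B by the mixture (1 - alpha) B + alpha A."""
    return quantum_kl(A, (1.0 - alpha) * B + alpha * A)


# Toy data: one-hot features, so each covariance is a diagonal histogram.
eye = np.eye(4)
sigma_p = covariance(eye[[0, 1]])  # points on states 0 and 1
sigma_q = covariance(eye[[2, 3]])  # points on states 2 and 3 (disjoint)

print(quantum_kl(sigma_p, sigma_q))            # infinite: supports are disjoint
print(regularized_kkl(sigma_p, sigma_q, 0.5))  # finite, equals -log(0.5) here
```

With one-hot features the covariance matrices are diagonal, so the quantity reduces to the classical KL divergence between histograms. That makes the toy case easy to check by hand: for disjoint supports the assumed mixture gives exactly `-log(alpha)`.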

Problem

Research questions and friction points this paper is trying to address.

Defines regularized KKL divergence for all distributions

Provides bounds for regularized KKL divergence deviation

Derives Wasserstein gradient descent for discrete distributions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Regularized KKL divergence for all distributions

Closed-form expression for finite point sets

Wasserstein gradient descent for discrete distributions
