Relaxed Triangle Inequality for Kullback-Leibler Divergence Between Multivariate Gaussian Distributions

πŸ“… 2026-01-31
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses a theoretical limitation of the Kullback–Leibler (KL) divergence in tasks that require metric properties: it violates the triangle inequality. Focusing on multivariate Gaussian distributions, the paper establishes the first tight relaxed triangle inequality for the KL divergence: under the constraints KL(𝒩₁, 𝒩₂) ≀ Ρ₁ and KL(𝒩₂, 𝒩₃) ≀ Ξ΅β‚‚, it derives the supremum of KL(𝒩₁, 𝒩₃) together with the conditions under which that supremum is attained. For small Ρ₁ and Ξ΅β‚‚, the supremum takes the asymptotic form Ρ₁ + Ξ΅β‚‚ + √(Ρ₁Ρ₂) + o(Ρ₁) + o(Ξ΅β‚‚). The derivation exploits the closed-form structure of the KL divergence between Gaussians. The result is applied to out-of-distribution detection with flow-based generative models and to safe reinforcement learning.
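
The analysis rests on the closed-form expression for the KL divergence between two $k$-dimensional Gaussians, $KL(\mathcal{N}_1, \mathcal{N}_2) = \frac{1}{2}\big[\mathrm{tr}(\Sigma_2^{-1}\Sigma_1) + (\mu_2-\mu_1)^\top \Sigma_2^{-1} (\mu_2-\mu_1) - k + \ln\frac{\det \Sigma_2}{\det \Sigma_1}\big]$. The snippet below is a minimal sketch of this standard formula in NumPy (an illustration, not code from the paper):

```python
import numpy as np

def gaussian_kl(mu1, sigma1, mu2, sigma2):
    """Closed-form KL(N(mu1, sigma1) || N(mu2, sigma2)) for k-dimensional Gaussians."""
    k = mu1.shape[0]
    sigma2_inv = np.linalg.inv(sigma2)
    diff = mu2 - mu1
    # 0.5 * [ tr(Sigma2^-1 Sigma1) + diff^T Sigma2^-1 diff - k + ln(det Sigma2 / det Sigma1) ]
    return 0.5 * (
        np.trace(sigma2_inv @ sigma1)
        + diff @ sigma2_inv @ diff
        - k
        + np.log(np.linalg.det(sigma2) / np.linalg.det(sigma1))
    )

# Example: two 2-D Gaussians
mu1, sigma1 = np.zeros(2), np.eye(2)
mu2, sigma2 = np.array([0.3, -0.1]), np.array([[1.2, 0.2], [0.2, 0.9]])
print(gaussian_kl(mu1, sigma1, mu2, sigma2))  # non-negative scalar; zero iff the two Gaussians coincide
```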

πŸ“ Abstract
The Kullback-Leibler (KL) divergence is not a proper distance metric and does not satisfy the triangle inequality, posing theoretical challenges in certain practical applications. Existing work has demonstrated that KL divergence between multivariate Gaussian distributions follows a relaxed triangle inequality. Given any three multivariate Gaussian distributions $\mathcal{N}_1, \mathcal{N}_2$, and $\mathcal{N}_3$, if $KL(\mathcal{N}_1, \mathcal{N}_2)\leq \epsilon_1$ and $KL(\mathcal{N}_2, \mathcal{N}_3)\leq \epsilon_2$, then $KL(\mathcal{N}_1, \mathcal{N}_3)<3\epsilon_1+3\epsilon_2+2\sqrt{\epsilon_1\epsilon_2}+o(\epsilon_1)+o(\epsilon_2)$. However, the supremum of $KL(\mathcal{N}_1, \mathcal{N}_3)$ is still unknown. In this paper, we investigate the relaxed triangle inequality for the KL divergence between multivariate Gaussian distributions and give the supremum of $KL(\mathcal{N}_1, \mathcal{N}_3)$ as well as the conditions when the supremum can be attained. When $\epsilon_1$ and $\epsilon_2$ are small, the supremum is $\epsilon_1+\epsilon_2+\sqrt{\epsilon_1\epsilon_2}+o(\epsilon_1)+o(\epsilon_2)$. Finally, we demonstrate several applications of our results in out-of-distribution detection with flow-based generative models and safe reinforcement learning.
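
The failure of the plain triangle inequality that motivates the paper is easy to reproduce numerically. The sketch below (an illustration, not taken from the paper) uses three unit-variance 1-D Gaussians with collinear means, for which the KL divergence reduces to $(\mu_a-\mu_b)^2/2$; the sum $KL(\mathcal{N}_1,\mathcal{N}_2)+KL(\mathcal{N}_2,\mathcal{N}_3)$ falls short of $KL(\mathcal{N}_1,\mathcal{N}_3)$, while the previously known relaxed bound $3\epsilon_1+3\epsilon_2+2\sqrt{\epsilon_1\epsilon_2}$ (with the $o(\cdot)$ terms dropped) still holds in this instance:

```python
import numpy as np

def kl_1d(mu_a, mu_b, var=1.0):
    """KL(N(mu_a, var) || N(mu_b, var)) for equal-variance 1-D Gaussians."""
    return (mu_a - mu_b) ** 2 / (2.0 * var)

mu1, mu2, mu3 = 0.0, 0.4, 0.9   # three unit-variance Gaussians with collinear means

eps1 = kl_1d(mu1, mu2)          # KL(N1, N2) β‰ˆ 0.08
eps2 = kl_1d(mu2, mu3)          # KL(N2, N3) β‰ˆ 0.125
kl_13 = kl_1d(mu1, mu3)         # KL(N1, N3) β‰ˆ 0.405

print("KL(N1,N2) + KL(N2,N3) =", eps1 + eps2)    # β‰ˆ 0.205
print("KL(N1,N3)             =", kl_13)          # β‰ˆ 0.405 -> plain triangle inequality fails
print("relaxed bound 3*eps1 + 3*eps2 + 2*sqrt(eps1*eps2) =",
      3 * eps1 + 3 * eps2 + 2 * np.sqrt(eps1 * eps2))  # β‰ˆ 0.815 -> holds in this instance
```
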
Problem

Research questions and friction points this paper is trying to address.

Kullback-Leibler divergence
triangle inequality
multivariate Gaussian distributions
supremum
relaxed triangle inequality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Kullback-Leibler divergence
relaxed triangle inequality
multivariate Gaussian distributions
supremum
out-of-distribution detection
Shiji Xiao
College of Computer Science and Electronic Engineering, Hunan University
Yufeng Zhang
PhD student in Computer Science, University of Defense Technology
symbolic execution, software model checking
Chubo Liu
College of Computer Science and Electronic Engineering, Hunan University
Yan Ding
College of Computer Science and Electronic Engineering, Hunan University
Keqin Li
AMA University
Robotics, Machine learning, Artificial intelligence, Computer vision
Kenli Li
Cheung Kong Professor, Hunan University
High-performance Computing, Parallel and Distributed Processing, AI and Big Data