🤖 AI Summary
Context: The maximum mean discrepancy (MMD) compares distributions through their kernel mean embeddings. This reliance on the mean has drawbacks: the MMD is a proper metric only under fairly strong kernel assumptions (the kernel must be characteristic), its standard estimator has quadratic cost in the sample size, and it is unclear whether the mean is the only meaningful way to represent a distribution in an RKHS.
Method: This paper introduces kernel quantile embeddings (KQEs), which generalize the notion of quantiles to reproducing kernel Hilbert spaces (RKHS). KQEs induce a family of probability divergences that sidestep the dependence on mean embeddings, are proper metrics on probability measures under weaker kernel conditions than the MMD requires, and recover a kernelized form of the sliced Wasserstein distance as a special case. The paper also develops nonparametric estimators for these distances that run in near-linear time.
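To make the sliced-Wasserstein connection concrete, a natural form for such a distance (a sketch in assumed notation, not necessarily the paper's exact definition) is

$$
d_p(P, Q) = \left( \mathbb{E}_{f \sim \nu} \int_0^1 \left| F_{f_{\#}P}^{-1}(\tau) - F_{f_{\#}Q}^{-1}(\tau) \right|^p \, \mathrm{d}\tau \right)^{1/p},
$$

where $\nu$ is a distribution over unit-norm directions $f$ in the RKHS, $f_{\#}P$ is the law of $f(X)$ for $X \sim P$, and $F^{-1}$ denotes the quantile function. Each direction reduces the comparison to one-dimensional quantiles, which is what makes near-linear estimation by sorting possible.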
Results: In two-sample hypothesis testing, the KQE-based distances are competitive with the MMD and its fast approximations in statistical power, while requiring weaker kernel assumptions for the metric property and costing near-linear rather than quadratic time to estimate.
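As a rough illustration of how a distance of this kind can be estimated in near-linear time, here is a minimal NumPy sketch. It approximates a Gaussian-kernel RKHS with random Fourier features and compares empirical quantiles of random projections; the function name `kqe_sliced_distance`, the choice of kernel, and the use of random Fourier features are assumptions made for illustration, not the paper's estimator.

```python
import numpy as np

def kqe_sliced_distance(X, Y, n_directions=50, n_features=256,
                        lengthscale=1.0, p=2, rng=None):
    """Sketch of a kernelised sliced quantile distance (illustrative only).

    Approximates the Gaussian-kernel RKHS with random Fourier features,
    projects both samples onto random unit directions in feature space,
    and averages p-th power distances between the empirical quantile
    functions of the 1D projections. Assumes len(X) == len(Y).
    """
    assert len(X) == len(Y), "this sketch assumes equal sample sizes"
    rng = np.random.default_rng(rng)
    d = X.shape[1]

    # Random Fourier features: phi(x) = sqrt(2/D) * cos(W^T x + b)
    # approximates k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2)).
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    phi = lambda Z: np.sqrt(2.0 / n_features) * np.cos(Z @ W + b)

    # Unit directions in feature space stand in for RKHS directions.
    U = rng.normal(size=(n_features, n_directions))
    U /= np.linalg.norm(U, axis=0, keepdims=True)
    proj_X, proj_Y = phi(X) @ U, phi(Y) @ U  # shape (n, n_directions)

    # With equal sample sizes, the 1D p-Wasserstein distance between two
    # empirical measures is the mean of |x_(i) - y_(i)|^p over sorted samples.
    qX, qY = np.sort(proj_X, axis=0), np.sort(proj_Y, axis=0)
    return float(np.mean(np.abs(qX - qY) ** p) ** (1.0 / p))
```

Sorting dominates, so the cost is roughly O(M · n · (D + log n)) for M directions, D features, and n samples per side, compared with the O(n²) of the standard MMD estimator.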
📝 Abstract
Embedding probability distributions into reproducing kernel Hilbert spaces (RKHS) has enabled powerful nonparametric methods such as the maximum mean discrepancy (MMD), a statistical distance with strong theoretical and computational properties. At its core, the MMD relies on kernel mean embeddings to represent distributions as mean functions in RKHS. However, it remains unclear if the mean function is the only meaningful RKHS representation. Inspired by generalised quantiles, we introduce the notion of kernel quantile embeddings (KQEs). We then use KQEs to construct a family of distances that: (i) are probability metrics under weaker kernel conditions than MMD; (ii) recover a kernelised form of the sliced Wasserstein distance; and (iii) can be efficiently estimated with near-linear cost. Through hypothesis testing, we show that these distances offer a competitive alternative to MMD and its fast approximations.
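For the hypothesis-testing use case the abstract mentions, a standard permutation test can wrap any such distance. The sketch below is generic and reuses the `kqe_sliced_distance` function from the previous snippet; it is again illustrative, not the paper's test procedure.

```python
import numpy as np

def permutation_two_sample_test(X, Y, statistic, n_permutations=200, rng=None):
    """Permutation p-value for H0: P = Q, using any distance `statistic`."""
    rng = np.random.default_rng(rng)
    observed = statistic(X, Y)
    pooled = np.vstack([X, Y])
    n = len(X)
    exceed = 0
    for _ in range(n_permutations):
        idx = rng.permutation(len(pooled))
        if statistic(pooled[idx[:n]], pooled[idx[n:]]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value valid with finitely many permutations.
    return (exceed + 1) / (n_permutations + 1)

# Example: fixing the seed (rng=0) inside the statistic reuses the same random
# projections on every call, so the statistic is deterministic across permutations.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 2))
Y = rng.normal(loc=0.3, size=(300, 2))
stat = lambda A, B: kqe_sliced_distance(A, B, rng=0)
print(permutation_two_sample_test(X, Y, stat, rng=rng))
```

Because the distance itself is cheap to evaluate, the permutation loop stays tractable even at a few hundred resamples; this is the regime where near-linear estimators pay off over quadratic-cost statistics.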