🤖 AI Summary
This paper addresses the slow convergence rate of kernel mean embeddings (KMEs). We propose a variance-aware adaptive estimation framework that explicitly incorporates sample variance information, measured in the reproducing kernel Hilbert space (RKHS), into KME construction, enabling accelerated convergence without prior knowledge of the data distribution or kernel. We design a data-driven variance estimator and extend the theoretical analysis from i.i.d. samples to stationary α-mixing sequences, preserving dimension-free convergence rates. Theoretically, under favorable variance structures, our method converges faster than the standard $1/\sqrt{n}$ rate. Empirically, it significantly improves statistical power in two-sample testing and estimation accuracy in robust parameter estimation, offering both rigorous theoretical guarantees and practical effectiveness.
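To make the central quantity concrete: the RKHS variance of the feature map around the mean embedding $\mu_P$ satisfies $\sigma^2 = \mathbb{E}\,\|k(X,\cdot)-\mu_P\|_{\mathcal{H}}^2 = \mathbb{E}[k(X,X)] - \|\mu_P\|_{\mathcal{H}}^2$, which admits a simple plug-in estimate. The sketch below is a minimal illustration of that plug-in estimate under a Gaussian kernel; it is not the paper's adaptive estimator, and the function names and bandwidth choice are our own assumptions.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth**2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def rkhs_variance_estimate(X, bandwidth=1.0):
    """Plug-in estimate of sigma^2 = E[k(X, X)] - ||mu_P||_H^2, the RKHS
    variance of the feature map k(X, .) around the mean embedding mu_P."""
    K = gaussian_kernel(X, X, bandwidth)
    n = len(X)
    # Unbiased estimate of ||mu_P||_H^2: average of the off-diagonal kernel entries.
    mean_embedding_norm_sq = (K.sum() - np.trace(K)) / (n * (n - 1))
    return float(np.mean(np.diag(K)) - mean_embedding_norm_sq)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
print(rkhs_variance_estimate(X))  # smaller values indicate a more favorable variance structure
```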
📝 Abstract
An important feature of kernel mean embeddings (KMEs) is that the rate of convergence of the empirical KME to the true distribution KME can be bounded independently of the dimension of the space, properties of the distribution, and smoothness features of the kernel. We show how to speed up convergence by leveraging variance information in the reproducing kernel Hilbert space. Furthermore, we show that even when such information is a priori unknown, we can efficiently estimate it from the data, recovering the desideratum of a distribution-agnostic bound that enjoys acceleration in fortuitous settings. We further extend our results from independent data to stationary mixing sequences and illustrate our methods in the context of hypothesis testing and robust parametric estimation.
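For context on the hypothesis-testing application, the standard two-sample statistic built from KMEs is the squared maximum mean discrepancy, $\mathrm{MMD}^2 = \|\mu_P - \mu_Q\|_{\mathcal{H}}^2$. The sketch below implements only the classical unbiased U-statistic estimate of $\mathrm{MMD}^2$ with a Gaussian kernel, not the variance-aware test studied in the paper; the bandwidth and function names are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, bandwidth=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * bandwidth**2))
    sq_dists = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd2_unbiased(X, Y, bandwidth=1.0):
    """Unbiased U-statistic estimate of MMD^2 = ||mu_P - mu_Q||_H^2,
    the squared RKHS distance between the two mean embeddings."""
    Kxx = rbf_kernel(X, X, bandwidth)
    Kyy = rbf_kernel(Y, Y, bandwidth)
    Kxy = rbf_kernel(X, Y, bandwidth)
    n, m = len(X), len(Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2.0 * Kxy.mean())

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(300, 2))  # sample from P
Y = rng.normal(0.5, 1.0, size=(300, 2))  # sample from a mean-shifted Q
print(mmd2_unbiased(X, Y))  # near 0 under H0: P = Q; calibrate via permutations
```

In practice the statistic is calibrated with a permutation test: pool the two samples, reshuffle the labels, and compare the observed value against the permutation distribution.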