🤖 AI Summary
This paper addresses the slow convergence rate of kernel mean embeddings (KMEs). We propose a variance-aware adaptive estimation framework that explicitly incorporates sample variance information, measured in the reproducing kernel Hilbert space (RKHS), into KME construction, enabling accelerated convergence without prior knowledge of the data distribution or kernel. We design a data-driven variance estimator and extend the theoretical analysis from i.i.d. samples to stationary α-mixing sequences, preserving dimension-free convergence rates. Theoretically, under favorable variance structures, our method converges faster than the standard $1/\sqrt{n}$ rate. Empirically, it significantly improves statistical power in two-sample testing and estimation accuracy in robust parameter estimation, offering both rigorous theoretical guarantees and practical effectiveness.
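To make the central quantity concrete: the RKHS variance of the feature map around the mean embedding $\mu_P$ satisfies $\sigma^2 = \mathbb{E}\,\|k(X,\cdot)-\mu_P\|_{\mathcal{H}}^2 = \mathbb{E}[k(X,X)] - \|\mu_P\|_{\mathcal{H}}^2$, which admits a simple plug-in estimate. The sketch below is a minimal illustration of that plug-in estimate under a Gaussian kernel; it is not the paper's adaptive estimator, and the function names and bandwidth choice are our own assumptions.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth**2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def rkhs_variance_estimate(X, bandwidth=1.0):
    """Plug-in estimate of sigma^2 = E[k(X, X)] - ||mu_P||_H^2, the RKHS
    variance of the feature map k(X, .) around the mean embedding mu_P."""
    K = gaussian_kernel(X, X, bandwidth)
    n = len(X)
    # Unbiased estimate of ||mu_P||_H^2: average of the off-diagonal kernel entries.
    mean_embedding_norm_sq = (K.sum() - np.trace(K)) / (n * (n - 1))
    return float(np.mean(np.diag(K)) - mean_embedding_norm_sq)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
print(rkhs_variance_estimate(X))  # smaller values indicate a more favorable variance structure
```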
📝 Abstract
An important feature of kernel mean embeddings (KMEs) is that the rate of convergence of the empirical KME to the true distribution KME can be bounded independently of the dimension of the space, properties of the distribution, and smoothness features of the kernel. We show how to speed up convergence by leveraging variance information in the reproducing kernel Hilbert space. Furthermore, we show that even when such information is a priori unknown, we can efficiently estimate it from the data, recovering the desideratum of a distribution-agnostic bound that enjoys acceleration in fortuitous settings. We further extend our results from independent data to stationary mixing sequences and illustrate our methods in the context of hypothesis testing and robust parametric estimation.
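For context on the hypothesis-testing application, the standard two-sample statistic built from KMEs is the squared maximum mean discrepancy, $\mathrm{MMD}^2 = \|\mu_P - \mu_Q\|_{\mathcal{H}}^2$. The sketch below implements only the classical unbiased U-statistic estimate of $\mathrm{MMD}^2$ with a Gaussian kernel, not the variance-aware test studied in the paper; the bandwidth and function names are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, bandwidth=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * bandwidth**2))
    sq_dists = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd2_unbiased(X, Y, bandwidth=1.0):
    """Unbiased U-statistic estimate of MMD^2 = ||mu_P - mu_Q||_H^2,
    the squared RKHS distance between the two mean embeddings."""
    Kxx = rbf_kernel(X, X, bandwidth)
    Kyy = rbf_kernel(Y, Y, bandwidth)
    Kxy = rbf_kernel(X, Y, bandwidth)
    n, m = len(X), len(Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2.0 * Kxy.mean())

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(300, 2))  # sample from P
Y = rng.normal(0.5, 1.0, size=(300, 2))  # sample from a mean-shifted Q
print(mmd2_unbiased(X, Y))  # near 0 under H0: P = Q; calibrate via permutations
```

In practice the statistic is calibrated with a permutation test: pool the two samples, reshuffle the labels, and compare the observed value against the permutation distribution.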