Unified Unbiased Variance Estimation for Maximum Mean Discrepancy: Robust Finite-Sample Performance with Imbalanced Data and Exact Acceleration under Null and Alternative Hypotheses

📅 2026-01-20
🤖 AI Summary
This work addresses the lack of a unified framework for variance estimation in the Maximum Mean Discrepancy (MMD) two-sample test: existing variance estimators differ between the null and alternative hypotheses and between balanced and imbalanced sample settings. Using the U-statistic representation and its Hoeffding decomposition, the authors establish a unified, unbiased finite-sample variance estimation framework for MMD that covers all of these hypothesis and sampling configurations. For the one-dimensional Laplacian kernel, they also develop an exact acceleration algorithm that reduces the computational complexity from O(n²) to O(n log n). The method is reported to be robust in finite samples, including with imbalanced data.
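For orientation, the statistic whose variance is being characterized can be written as follows. This is the standard unbiased U-statistic estimator of the squared MMD, not a formula quoted from the paper. The sample sizes $m$ and $n$ (allowed to differ, which is the imbalanced case), the samples $x_i$ and $y_j$, and the bandwidth $\sigma$ are notation assumed here for illustration.

$$
\widehat{\mathrm{MMD}}_u^2
= \frac{1}{m(m-1)} \sum_{i \neq j} k(x_i, x_j)
+ \frac{1}{n(n-1)} \sum_{i \neq j} k(y_i, y_j)
- \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j),
\qquad
k(x, y) = \exp\!\left(-\frac{|x - y|}{\sigma}\right).
$$

Each of the three sums runs over a quadratic number of kernel evaluations, which is where the O(n²) baseline cost comes from.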

📝 Abstract
The maximum mean discrepancy (MMD) is a kernel-based nonparametric statistic for two-sample testing, whose inferential accuracy depends critically on variance characterization. Existing work provides various finite-sample estimators of the MMD variance, often differing under the null and alternative hypotheses and across balanced or imbalanced sampling schemes. In this paper, we study the variance of the MMD statistic through its U-statistic representation and Hoeffding decomposition, and establish a unified finite-sample characterization covering different hypotheses and sample configurations. Building on this analysis, we propose an exact acceleration method for the univariate case under the Laplacian kernel, which reduces the overall computational complexity from $\mathcal O(n^2)$ to $\mathcal O(n \log n)$.
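The abstract does not spell out the acceleration algorithm, so the following is only a minimal sketch of why an exact $\mathcal O(n \log n)$ evaluation is possible for the univariate Laplacian kernel. It applies a standard sort-and-recurse trick to the unbiased MMD statistic itself, not to the authors' variance estimator. The function names, the bandwidth parameter `sigma`, and the way the cross term is recovered from the pooled sample are assumptions made for this illustration.

```python
import math


def laplacian_pair_sum(z, sigma=1.0):
    """Sum of exp(-|z_i - z_j| / sigma) over unordered pairs i < j.

    After sorting, the contribution of all earlier points to the current
    point obeys acc_j = (acc_{j-1} + 1) * exp(-(z_j - z_{j-1}) / sigma),
    so one pass suffices: O(n log n) for the sort, O(n) for the scan.
    """
    zs = sorted(z)
    total = 0.0
    acc = 0.0  # sum over earlier points i of exp(-(z_j - z_i) / sigma)
    for prev, cur in zip(zs, zs[1:]):
        # Every factor lies in (0, 1], so the recursion cannot overflow.
        acc = (acc + 1.0) * math.exp(-(cur - prev) / sigma)
        total += acc
    return total


def mmd2_unbiased_fast(x, y, sigma=1.0):
    """Unbiased squared MMD for 1-D samples; sizes m and n may differ."""
    m, n = len(x), len(y)
    sxx = laplacian_pair_sum(x, sigma)
    syy = laplacian_pair_sum(y, sigma)
    # Pooled pairs = within-x pairs + within-y pairs + cross pairs.
    sxy = laplacian_pair_sum(list(x) + list(y), sigma) - sxx - syy
    return (2.0 * sxx / (m * (m - 1))
            + 2.0 * syy / (n * (n - 1))
            - 2.0 * sxy / (m * n))


def mmd2_unbiased_naive(x, y, sigma=1.0):
    """Reference O(n^2) evaluation of the same quantity, for checking."""
    def k(a, b):
        return math.exp(-abs(a - b) / sigma)

    m, n = len(x), len(y)
    sxx = sum(k(a, b) for i, a in enumerate(x) for j, b in enumerate(x) if i != j)
    syy = sum(k(a, b) for i, a in enumerate(y) for j, b in enumerate(y) if i != j)
    sxy = sum(k(a, b) for a in x for b in y)
    return sxx / (m * (m - 1)) + syy / (n * (n - 1)) - 2.0 * sxy / (m * n)
```

The fast and naive versions agree up to floating-point rounding, including with unequal sample sizes. The trick relies on $|x - y|$ becoming a signed difference once the points are sorted, so it is specific to one dimension and to kernels of this exponential form.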
Problem

Research questions and friction points this paper is trying to address.

Maximum Mean Discrepancy
Variance Estimation
Two-sample Testing
Imbalanced Data
U-statistic
Innovation

Methods, ideas, or system contributions that make the work stand out.

MMD Variance Estimation
U-statistic
Hoeffding Decomposition
Computational Acceleration
Imbalanced Data
Shijie Zhong
Department of Physics, University of Colorado at Boulder
Geophysics, Planetary Sciences
Jiangfeng Fu
School of Power and Energy, Northwestern Polytechnical University
Yikun Yang
School of Power and Energy, Northwestern Polytechnical University