Learning Deep Kernels for Non-Parametric Independence Testing

📅 2024-09-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Conventional kernel-based independence tests suffer from low statistical power and poor small-sample performance due to hand-crafted kernel design, especially in detecting high-dimensional nonlinear dependencies. Method: We propose a deep kernel learning framework grounded in asymptotic test power maximization. It parameterizes the kernel function via a differentiable deep neural network and performs end-to-end optimization using the Hilbert–Schmidt Independence Criterion (HSIC) statistic with an unbiased U-statistic estimator. Theoretical analysis guarantees convergence to the optimal kernel that maximizes the true asymptotic test power. Results: Extensive experiments on synthetic and real-world datasets demonstrate substantial improvements in independence testing power: average detection accuracy increases by 23% over Gaussian and distance covariance kernels. The method effectively captures complex dependency structures—including nonlinear, higher-order, and multi-scale relationships—while maintaining rigorous statistical foundations.

Technology Category

Application Category

📝 Abstract
The Hilbert-Schmidt Independence Criterion (HSIC) is a powerful tool for nonparametric detection of dependence between random variables. It crucially depends, however, on the selection of reasonable kernels; commonly-used choices like the Gaussian kernel, or the kernel that yields the distance covariance, are sufficient only for amply sized samples from data distributions with relatively simple forms of dependence. We propose a scheme for selecting the kernels used in an HSIC-based independence test, based on maximizing an estimate of the asymptotic test power. We prove that maximizing this estimate indeed approximately maximizes the true power of the test, and demonstrate that our learned kernels can identify forms of structured dependence between random variables in various experiments.
Problem

Research questions and friction points this paper is trying to address.

Detect subtle dependencies in high-dimensional complex variables
Construct finite-sample valid tests using variational mutual information
Optimize representation to maximize test power, not just statistic
Innovation

Methods, ideas, or system contributions that make the work stand out.

Variational estimators for mutual information tests
Link between variational MI and HSIC kernel learning
Optimized HSIC tests with deep kernels
🔎 Similar Papers
N
Nathaniel Xu
University of British Columbia
F
Feng Liu
University of Melbourne
Danica J. Sutherland
Danica J. Sutherland
University of British Columbia + Amii
Machine Learning