🤖 AI Summary
Conventional two-sample tests suffer from low statistical power and poor robustness in small-sample, high-dimensional settings. To address this, the paper extends MMD-FUSE, a Maximum Mean Discrepancy (MMD) testing framework, by fusing classical and quantum kernels. Combining the inductive biases of classical kernels with the representational capacity of quantum kernels lets the test adapt to the data and detect distributional discrepancies with high sensitivity. Theoretically, the test remains consistent when quantum kernels are embedded into the MMD statistic. Algorithmically, a data-driven kernel combination strategy replaces reliance on a single hand-picked kernel. Experiments on multiple small-sample, high-dimensional benchmark datasets and on real-world clinical data show that the fused test improves test power (a reported average gain of +12.7%) while remaining stable and generalizing across domains. The work points toward more effective statistical inference when sample sizes are limited.
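For reference, the quantity underlying the test can be written out. The first line below is the standard population definition of the squared MMD between distributions $P$ and $Q$ for a kernel $k$. The second line is one common form of quantum kernel, the fidelity between encoded states. The feature map $|\phi(x)\rangle$ and the choice of a fidelity kernel are illustrative assumptions, since the summary does not say which quantum kernels the paper uses.

$$
\mathrm{MMD}^2(P, Q; k) = \mathbb{E}_{X, X' \sim P}\,[k(X, X')] + \mathbb{E}_{Y, Y' \sim Q}\,[k(Y, Y')] - 2\,\mathbb{E}_{X \sim P,\, Y \sim Q}\,[k(X, Y)]
$$

$$
k_{\mathrm{Q}}(x, y) = \left| \langle \phi(x) \mid \phi(y) \rangle \right|^2
$$

When $k$ is characteristic, $\mathrm{MMD}(P, Q; k) = 0$ if and only if $P = Q$, which is the property that consistency arguments for MMD-based tests rely on.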
📝 Abstract
Two-sample tests, which determine whether two sets of samples come from the same distribution, are widely used across scientific fields and machine learning, for example to evaluate the effectiveness of drugs or to A/B test marketing strategies. Kernel-based hypothesis tests embed the data into a reproducing kernel Hilbert space (RKHS), which lets them capture complex high-dimensional structure and obtain accurate results in a model-free way. Although the choice of kernel plays a crucial role in their performance, little is understood about how to choose a kernel, especially for small datasets. Here we aim to construct a hypothesis test that is effective even for small datasets, building on MMD-FUSE, a kernel-based testing framework founded on the maximum mean discrepancy (MMD). To this end, we enhance the MMD-FUSE framework by incorporating quantum kernels and propose a novel hybrid testing strategy that fuses classical and quantum kernels. This approach creates a powerful and adaptive test by combining the domain-specific inductive biases of classical kernels with the unique expressive power of quantum kernels. We evaluate our method on various synthetic and real-world clinical datasets, and our experiments reveal two key findings: 1) With appropriate hyperparameter tuning, MMD-FUSE with quantum kernels consistently improves test power over its classical counterparts, especially for small and high-dimensional data. 2) The proposed hybrid framework is robust, adapting to different data characteristics and achieving high test power across diverse scenarios. These results highlight the potential of quantum-inspired and hybrid kernel strategies to build more effective statistical tests, offering a versatile tool for data analysis where sample sizes are limited.
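The idea of fusing several kernels inside one MMD permutation test can be sketched in a few lines of NumPy. This is a simplified illustration and not the authors' implementation: the function names, the bandwidths, the soft-maximum fusion with parameter `lam`, and the omission of any per-kernel normalization are all assumptions. The "quantum" kernel here is the classically computable closed-form fidelity of a product-state angle encoding, chosen only so the example runs without a quantum simulator; the abstract does not specify which quantum kernels are used.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Classical Gaussian (RBF) kernel matrix between rows of a and rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def fidelity_kernel(a, b, scale=1.0):
    """Fidelity |<phi(x)|phi(y)>|^2 for a product-state angle encoding
    (one single-qubit rotation per feature). Assumed stand-in for a
    quantum kernel; it has the closed form prod_j cos^2((x_j - y_j) / 2)."""
    diff = scale * (a[:, None, :] - b[None, :, :])
    return np.prod(np.cos(diff / 2.0) ** 2, axis=2)


def mmd2_unbiased(K, n):
    """Unbiased squared-MMD estimate from a pooled kernel matrix K whose
    first n rows/columns belong to the first sample."""
    m = K.shape[0] - n
    Kxx, Kyy, Kxy = K[:n, :n], K[n:, n:], K[:n, n:]
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * Kxy.mean()


def fused_statistic(kernel_mats, n, lam=1.0):
    """Soft maximum (log-mean-exp) of the per-kernel MMD estimates.
    Simplification: no per-kernel normalization is applied."""
    z = lam * np.array([mmd2_unbiased(K, n) for K in kernel_mats])
    return (z.max() + np.log(np.mean(np.exp(z - z.max())))) / lam


def fuse_permutation_test(x, y, kernels, n_perm=200, alpha=0.05, seed=0):
    """Permutation test: compare the fused statistic on the observed split
    with its values under random relabelings of the pooled sample."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = len(x)
    mats = [k(pooled, pooled) for k in kernels]  # computed once, then permuted
    observed = fused_statistic(mats, n)
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(len(pooled))
        permuted = [K[np.ix_(p, p)] for K in mats]
        if fused_statistic(permuted, n) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value, p_value <= alpha


# Usage: two small, mean-shifted Gaussian samples in 5 dimensions.
data_rng = np.random.default_rng(1)
x = data_rng.normal(0.0, 1.0, size=(30, 5))
y = data_rng.normal(1.5, 1.0, size=(30, 5))
kernels = [
    lambda a, b: gaussian_kernel(a, b, 1.0),
    lambda a, b: gaussian_kernel(a, b, 2.0),
    lambda a, b: fidelity_kernel(a, b, 1.0),
]
stat, p_value, reject = fuse_permutation_test(x, y, kernels)
print(f"fused statistic={stat:.3f}, p-value={p_value:.3f}, reject={reject}")
```

Computing each kernel matrix once on the pooled sample and permuting its rows and columns keeps the cost of every permutation low, which matters when many kernels are fused. Adding or removing entries in `kernels` changes the mix of classical and quantum-style kernels without touching the test itself.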