🤖 AI Summary
This study addresses statistical inference from samples contaminated by potentially heteroscedastic measurement error, where ignoring the noise degrades inference and existing correction techniques are often computationally costly. The authors propose a convolutional Maximum Mean Discrepancy (convMMD) framework, which extends MMD to noisy settings by comparing distributions after convolution with the known noise distribution, and which remains a valid metric under standard kernel conditions. Theoretical contributions include finite-sample deviation bounds that are unaffected by measurement error, an equivalence between testing under noise and kernel smoothing, and proofs of consistency and asymptotic normality for the resulting convMMD-based estimator. The estimator is implemented efficiently with stochastic gradient descent, and its practical effectiveness is demonstrated in simulations and in applications in astronomy and the social sciences.
📝 Abstract
Modern data analyses frequently encounter settings where samples of variables are contaminated by measurement error. Ignoring measurement noise can substantially degrade statistical inference, while existing correction techniques are often computationally costly and inefficient. Recent advances in kernel methods, particularly those based on Maximum Mean Discrepancy (MMD), have enabled flexible, distribution-free inference, yet typically assume precise data and overlook contamination by measurement error. In this work, we introduce a novel framework for inference with samples corrupted by potentially heteroscedastic noise from a known distribution. Central to our approach is the convolutional MMD (convMMD), which compares distributions after noise convolution and retains metric validity under standard kernel conditions. We establish finite-sample deviation bounds that are unaffected by measurement error and prove an equivalence between testing under noise and kernel smoothing. Leveraging these insights, we introduce a convMMD-based estimator for inference with noisy, heteroscedastic observations. We establish its consistency and asymptotic normality, and provide an efficient implementation using stochastic gradient descent. We demonstrate the practical effectiveness of our approach through simulations and applications in astronomy and social sciences.
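To make the idea of comparing distributions after noise convolution concrete, here is a minimal Python sketch. It is not the authors' convMMD estimator: the Gaussian kernel, the Gaussian noise with known per-observation standard deviations, the Monte Carlo approximation of the convolution, and all function names (`gaussian_kernel`, `mmd2_biased`, `noise_convolved_mmd2`) are assumptions chosen for illustration only.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between two 1-D sample arrays."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    kxx = gaussian_kernel(x, x, bandwidth).mean()
    kyy = gaussian_kernel(y, y, bandwidth).mean()
    kxy = gaussian_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2.0 * kxy


def noise_convolved_mmd2(observed, noise_sd, model_sampler, rng, bandwidth=1.0):
    """Compare noisy observations with model draws passed through the same noise.

    observed      : 1-D array of noisy measurements
    noise_sd      : 1-D array of known noise standard deviations, one per
                    observation (heteroscedastic, assumed Gaussian here)
    model_sampler : hypothetical callable (n, rng) -> n clean draws from the
                    candidate model
    """
    clean = model_sampler(observed.size, rng)
    # Adding noise to the clean draws approximates the convolution of the
    # model distribution with the known noise distribution by Monte Carlo.
    noisy_model = clean + rng.normal(0.0, noise_sd)
    return mmd2_biased(observed, noisy_model, bandwidth)
```

In this toy setup, a candidate model that matches the latent distribution should yield a smaller discrepancy than a misspecified one, which is the property a minimum-discrepancy estimator would exploit. Minimizing such a quantity over model parameters with stochastic gradients would require a differentiable sampler, which this sketch does not attempt.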