AI Summary
To address the failure of parameter estimation in exponential family models under contaminated data, this paper proposes a robust score matching method. It introduces the geometric median of means into the score matching framework for the first time, avoiding computation of the normalizing constant while preserving the convexity that score matching has in exponential families. The method applies to non-Gaussian exponential family graphical models and comes with support recovery guarantees under data contamination. Experiments show that it greatly outperforms standard score matching on contaminated data while performing comparably on clean data. Support recovery is also studied on a real-world precipitation dataset. The core contribution is the principled integration of the geometric median of means with score matching, which yields a parameter estimation procedure that needs no normalizing constant, is robust to contamination, has theoretical guarantees, and applies broadly across exponential family models.
Abstract
Proposed in Hyvärinen (2005), score matching is a parameter estimation procedure that does not require computation of distributional normalizing constants. In this work we utilize the geometric median of means to develop a robust score matching procedure that yields consistent parameter estimates in settings where the observed data has been contaminated. A special appeal of the proposed method is that it retains convexity in exponential family models. The new method is therefore particularly attractive for non-Gaussian, exponential family graphical models where evaluation of normalizing constants is intractable. Support recovery guarantees for such models when contamination is present are provided. Additionally, support recovery is studied in numerical experiments and on a precipitation dataset. We demonstrate that the proposed robust score matching estimator performs comparably to the standard score matching estimator when no contamination is present but greatly outperforms this estimator in a setting with contamination.
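For readers unfamiliar with the geometric median of means, the following Python sketch shows the generic idea on a plain mean-estimation problem: split the sample into blocks, average within each block, and take the geometric median of the block means. It is an illustration only and not the paper's estimator. The function names, the Weiszfeld iteration used for the geometric median, the block count, and the synthetic contamination are all assumptions made here.

```python
import numpy as np


def geometric_median(points, tol=1e-8, max_iter=500):
    """Approximate the geometric median of the rows of `points` (Weiszfeld iterations)."""
    points = np.asarray(points, dtype=float)
    y = points.mean(axis=0)  # start from the ordinary mean
    for _ in range(max_iter):
        dist = np.linalg.norm(points - y, axis=1)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        w = 1.0 / dist
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y


def geometric_median_of_means(samples, n_blocks, seed=None):
    """Split samples into random blocks, average each block, return the geometric median."""
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    blocks = np.array_split(idx, n_blocks)
    block_means = np.stack([samples[b].mean(axis=0) for b in blocks])
    return geometric_median(block_means)


# Hypothetical demo: 1000 clean draws centred at the origin plus 5 gross outliers.
rng = np.random.default_rng(0)
clean = rng.normal(size=(1000, 2))
outliers = np.full((5, 2), 1000.0)
data = np.vstack([clean, outliers])

plain_mean = data.mean(axis=0)
robust_mean = geometric_median_of_means(data, n_blocks=50, seed=1)
print("sample mean:              ", plain_mean)
print("geometric median of means:", robust_mean)
```

In this toy setting the five outliers can corrupt at most five of the fifty block means, so the geometric median stays close to the origin while the ordinary sample mean is pulled far away. The number of blocks trades robustness against efficiency, and it needs to exceed twice the number of contaminated observations for this argument to apply.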