AI Summary
This work proposes the GERVE framework, which approximates Gibbs measures directly from samples when the number of modes is unknown and full density estimation is infeasible. By maximizing an entropy-regularized variational objective with natural-gradient updates, GERVE combines kernel density estimation, mean shift, variational inference, and annealing to fit Gaussian mixtures that concentrate on high-density regions and to assign cluster memberships from responsibilities. Theoretically, mixture components converge to population modes as the Gibbs temperature approaches zero, and at fixed temperature the maximizers of the empirical objective exist, are consistent, and are asymptotically normal. Practically, the method includes a bootstrap procedure for per-mode confidence ellipses and stability scores. Experiments on synthetic and real-world data show accurate mode recovery that is robust to overspecifying the number of mixture components, which reduces reliance on a pre-specified number of clusters.
Abstract
We approach multivariate mode estimation through Gibbs distributions and introduce GERVE (Gibbs-measure Entropy-Regularised Variational Estimation), a likelihood-free framework that approximates Gibbs measures directly from samples by maximizing an entropy-regularised variational objective with natural-gradient updates. GERVE brings together kernel density estimation, mean-shift, variational inference, and annealing in a single platform for mode estimation. It fits Gaussian mixtures that concentrate on high-density regions and yields cluster assignments from responsibilities, with reduced sensitivity to the chosen number of components. We provide theory in two regimes: as the Gibbs temperature approaches zero, mixture components converge to population modes; at fixed temperature, maximisers of the empirical objective exist, are consistent, and are asymptotically normal. We also propose a bootstrap procedure for per-mode confidence ellipses and stability scores. Simulation and real-data studies show accurate mode recovery and emergent clustering, robust to mixture overspecification. GERVE is a practical likelihood-free approach when the number of modes or groups is unknown and full density estimation is impractical.
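For readers unfamiliar with the ingredients named above, the following is a minimal sketch of classical Gaussian-kernel mean shift, one of the techniques the abstract says GERVE brings together. It is not the GERVE algorithm: the function name `mean_shift`, the bandwidth, the synthetic data, and the starting points are all assumptions chosen for illustration. It shows how several starting points climb the kernel density estimate and settle on a smaller number of modes.

```python
import numpy as np

def mean_shift(points, starts, bandwidth=0.5, n_iter=200, tol=1e-8):
    """Gaussian-kernel mean shift (illustrative, not GERVE).

    Each start is repeatedly replaced by the kernel-weighted mean of the
    data, which moves it uphill on the kernel density estimate.
    """
    x = np.array(starts, dtype=float)
    for _ in range(n_iter):
        # Squared distances between current positions and data: shape (m, n)
        d2 = ((x[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
        w = np.exp(-0.5 * d2 / bandwidth**2)           # Gaussian kernel weights
        new_x = (w @ points) / w.sum(axis=1, keepdims=True)
        converged = np.max(np.abs(new_x - x)) < tol
        x = new_x
        if converged:
            break
    return x

# Hypothetical two-mode sample in 2D (assumed data, not from the paper).
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=[-2.0, 0.0], scale=0.4, size=(300, 2)),
    rng.normal(loc=[3.0, 1.0], scale=0.4, size=(300, 2)),
])

# Four starting points for two modes: deliberately more starts than modes.
starts = np.array([[-1.0, 0.5], [2.0, 0.0], [-3.0, -0.5], [4.0, 1.5]])
modes = mean_shift(data, starts)
print(np.round(modes, 2))
```

In this sketch, starts that begin in the same basin of attraction end at the same point, so using more starts than modes yields duplicates rather than spurious modes. How GERVE itself handles surplus mixture components is described in the paper, not by this example.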