🤖 AI Summary
This work addresses the problem of clustering potentially infinite-dimensional random objects, such as dynamic functional data or random graphs, that are naturally modeled in Hilbert spaces, where characterizing a probability measure through a density can be ill-defined. The authors propose a Gaussian mixture framework for Hilbert-space-valued data based on kernel mean embeddings, show that the proposed algorithm is well defined, and prove that the model yields a dense class of approximations in infinite-dimensional spaces. They also develop efficient optimization algorithms for parameter estimation. The framework accommodates data structures including $L^2$-functional data and random graphs in Laplacian spaces, and is evaluated in extensive experiments on diverse structures and data geometries, including data arising in modern medical applications.
📝 Abstract
Modern datasets across many disciplines increasingly consist of time-evolving, potentially infinite-dimensional random objects, such as dynamic functional data, which are naturally modeled in Hilbert spaces. In these settings, characterizing probability measures, for example, through densities, can be ill-defined or technically challenging. Motivated by clustering applications, we propose a Gaussian mixture framework for Hilbert-space-valued data based on kernel mean embeddings and develop efficient optimization algorithms for estimation. We establish theoretical guarantees showing that the proposed algorithm is well defined and that the model yields a dense class of approximations in infinite-dimensional spaces. We evaluate the framework through extensive experiments on diverse structures and data geometries, including $L^2$-functional data and random graphs in Laplacian spaces arising in modern medical applications.
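To make the notion of a kernel mean embedding concrete, the following minimal Python sketch compares two samples of discretized curves through the distance between their empirical embeddings (the squared maximum mean discrepancy). It is an illustration only and not the estimation algorithm of the paper: the Gaussian kernel on $L^2$ distances, the uniform grid, the Riemann-sum approximation of the $L^2$ norm, the bandwidth, and all function names are assumptions introduced here.

```python
import numpy as np

def l2_sq_dists(X, Y, dt):
    """Approximate squared L^2 distances between curves sampled on a shared
    uniform grid with spacing dt (Riemann sum; an assumption of this sketch)."""
    diff = X[:, None, :] - Y[None, :, :]
    return np.sum(diff ** 2, axis=2) * dt

def gaussian_gram(X, Y, dt, bandwidth):
    """Gram matrix of a Gaussian kernel built on the L^2 distance."""
    return np.exp(-l2_sq_dists(X, Y, dt) / (2.0 * bandwidth ** 2))

def mmd_squared(X, Y, dt, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared RKHS distance between
    the empirical kernel mean embeddings of the samples X and Y."""
    kxx = gaussian_gram(X, X, dt, bandwidth).mean()
    kyy = gaussian_gram(Y, Y, dt, bandwidth).mean()
    kxy = gaussian_gram(X, Y, dt, bandwidth).mean()
    return kxx + kyy - 2.0 * kxy

# Synthetic example: two groups of noisy curves on [0, 1] (hypothetical data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
dt = t[1] - t[0]
A = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((30, t.size))
B = np.cos(2 * np.pi * t) + 0.1 * rng.standard_normal((30, t.size))

within = mmd_squared(A[:15], A[15:], dt)   # two halves of the same group
between = mmd_squared(A, B, dt)            # two different groups
print(f"within-group MMD^2:  {within:.4f}")
print(f"between-group MMD^2: {between:.4f}")
```

In this toy setting the between-group value should be clearly larger than the within-group value, which is the kind of separation a clustering method built on embeddings can exploit without ever writing down a density.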