🤖 AI Summary
This work addresses a limitation of classical archetypal analysis: its linear geometry struggles to model data with strongly non-linear structure, while existing neural extensions often weaken the geometric interpretability of archetypes and interpolations. The authors propose a Riemannian archetypal analysis framework grounded in a data-driven pullback metric. They introduce deformed star distributions and the associated pullback Riemannian geometry, and define the Riemannian archetypal mapping (RAM) as a projection onto the manifold of geodesically convex combinations of archetypes. Optimization combines a convex relaxation with a subsequent non-convex refinement. On synthetic data and MNIST, the framework produces meaningful geodesics, useful denoising projections, and geometry-aware classifications, and the experiments also clarify where current optimization limitations remain.
📝 Abstract
Classical archetypal analysis is appealing for its interpretability, but its linear geometry can limit performance on data with strongly non-linear structure; at the same time, existing neural extensions improve flexibility while often weakening the geometric meaning of archetypes and interpolations. In this work, we develop a Riemannian version of archetypal analysis based on data-driven pullback geometry for real-valued data, with the goal of combining the interpretability of classical archetypal analysis with the expressive power of modern non-linear models. We introduce a class of deformed star distributions together with associated pullback Riemannian geometry to provide a statistical interpretation of the resulting manifold mappings, define the Riemannian archetypal mapping (RAM) as a projection onto the manifold of geodesically convex combinations of archetypes, and propose a practical optimization scheme based on convex relaxation followed by non-convex refinement. We further propose a learning scheme that yields reasonable, albeit generally suboptimal, deformed star distributions from data. Experiments on synthetic examples and MNIST show that the resulting framework produces meaningful geodesics, useful denoising projections, and geometry-aware classifications, while also clarifying where current optimization limitations remain.
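To make the idea of a projection onto geodesically convex combinations concrete, the following is a minimal sketch under simplifying assumptions and is not the authors' implementation. It assumes the geometry is the pullback of the Euclidean metric under a known diffeomorphism, in which case geodesics are straight lines in the mapped coordinates and the projection reduces to a simplex-constrained least-squares problem there. The map `phi`, the archetypes, and the helper names `geodesic`, `project_simplex`, and `archetypal_projection` are all hypothetical, and the learned deformed star geometry and the non-convex refinement step are not reproduced.

```python
import numpy as np

# Hypothetical diffeomorphism of R^2 (an assumption, for illustration only):
# it straightens the parabola x2 = x1**2 onto the horizontal axis.
def phi(x):
    x = np.asarray(x, dtype=float)
    return np.array([x[0], x[1] - x[0] ** 2])

def phi_inv(z):
    z = np.asarray(z, dtype=float)
    return np.array([z[0], z[1] + z[0] ** 2])

def geodesic(x, y, t):
    """Pullback geodesic: a straight line in phi-coordinates, mapped back."""
    return phi_inv((1.0 - t) * phi(x) + t * phi(y))

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def archetypal_projection(x, archetypes, n_iter=500):
    """Project x onto geodesically convex combinations of the archetypes.

    Under the flat pullback assumption this is a least-squares problem over
    the simplex in phi-coordinates, solved here by projected gradient descent.
    """
    Z = np.array([phi(a) for a in archetypes])   # archetypes in phi-coordinates, shape (k, d)
    z = phi(x)
    w = np.full(len(Z), 1.0 / len(Z))            # start from uniform weights
    step = 1.0 / np.linalg.norm(Z, 2) ** 2       # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = Z @ (w @ Z - z)
        w = project_simplex(w - step * grad)
    return phi_inv(w @ Z), w

archetypes = [(-1.0, 1.0), (1.0, 1.0), (0.0, 2.0)]
midpoint = geodesic(archetypes[0], archetypes[1], 0.5)
x_proj, weights = archetypal_projection((0.0, 3.0), archetypes)
```

In this toy setting the geodesic midpoint between the first two archetypes is (0, 0), which follows the parabola instead of passing through the Euclidean midpoint (0, 1), and the point (0, 3) projects onto the third archetype. Because the assumed geometry is flat in the mapped coordinates, the weight problem here is convex; the abstract's two-stage scheme suggests that the full problem is harder than this sketch.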