🤖 AI Summary
Traditional manifold learning methods are constrained by symmetric Riemannian geometry, limiting their ability to capture inherent asymmetric structures in data. This work introduces Finsler geometry into general-purpose manifold learning for the first time, constructing an asymmetric distance metric and enabling embeddings in Finsler spaces to overcome symmetry constraints. We propose asymmetric extensions of established algorithms—Finsler t-SNE and Finsler UMAP—applicable to arbitrary data types. Experimental results demonstrate that our approach effectively uncovers structural patterns, such as density hierarchies, that are missed by conventional symmetric methods, consistently yielding higher-quality low-dimensional representations than Euclidean embeddings on both synthetic and large-scale real-world datasets.
📝 Abstract
Manifold learning is a fundamental task at the core of data analysis and visualisation. It aims to capture the simple underlying structure of complex high-dimensional data by preserving pairwise dissimilarities in low-dimensional embeddings. Traditional methods rely on symmetric Riemannian geometry, thus forcing the dissimilarities and the embedding spaces (e.g. Euclidean) to be symmetric. In practice, however, this discards valuable asymmetric information inherent to the non-uniformity of data samples. We propose to harness this asymmetry by switching to Finsler geometry, an asymmetric generalisation of Riemannian geometry, and introduce a Finsler manifold learning pipeline that constructs asymmetric dissimilarities and embeds them in a Finsler space. This greatly broadens the applicability of existing asymmetric embedders beyond traditionally directed data to any data. We also modernise asymmetric embedding by generalising current reference methods to asymmetry, yielding Finsler t-SNE and Finsler UMAP. On controlled synthetic and large real datasets, we show that our asymmetric pipeline reveals valuable information lost in the traditional pipeline, e.g. density hierarchies, and consistently produces embeddings of higher quality than their Euclidean counterparts.
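To make the notion of an asymmetric dissimilarity concrete, here is a minimal sketch in Python using a Randers-type Finsler norm, the standard textbook example of a Finsler metric: F(v) = sqrt(vᵀMv) + ωᵀv, which is asymmetric (F(−v) ≠ F(v)) whenever the drift term ω is non-zero. This is an illustration of the general idea only, not the paper's specific construction; the choice of `M`, `omega`, and the one-step pairwise dissimilarity are assumptions for the example.

```python
import numpy as np

def randers_norm(v, M, omega):
    """Randers-type Finsler norm F(v) = sqrt(v^T M v) + omega . v.

    Asymmetric: F(-v) != F(v) whenever omega . v != 0.
    Positivity requires the M^{-1}-norm of omega to be < 1.
    """
    return np.sqrt(v @ M @ v) + omega @ v

def asymmetric_dissimilarities(X, M, omega):
    """Pairwise one-step dissimilarities D[i, j] = F(x_j - x_i).

    Unlike a Euclidean distance matrix, D is generally NOT symmetric.
    """
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                D[i, j] = randers_norm(X[j] - X[i], M, omega)
    return D

# Toy example: three points in the plane with a drift along the x-axis.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
M = np.eye(2)
omega = np.array([0.3, 0.0])  # drift term; norm < 1 keeps F positive
D = asymmetric_dissimilarities(X, M, omega)
# Moving "with" the drift is cheaper than moving "against" it:
# D[1, 0] = 0.7 < D[0, 1] = 1.3
```

Such an asymmetric matrix `D` could then be fed to an embedder that accepts asymmetric dissimilarities, in place of the symmetric Euclidean distance matrix used by the traditional pipeline.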