🤖 AI Summary
High-dimensional data, such as biological or image datasets, often exhibit substantial intra-class variance and nonlinear manifold structure, which makes it difficult for existing dimensionality reduction methods to separate clusters and resolve subclusters. To address this, this work proposes MAPLE, a method that strengthens UMAP's manifold modeling through self-supervised learning, using Maximum Manifold Capacity Representations (MMCRs) to compress variance among locally similar points while amplifying differences between dissimilar ones. This yields a more accurate characterization of low-dimensional manifold geometry. While keeping computational overhead comparable to UMAP, MAPLE significantly improves inter-cluster separation and subcluster resolution across diverse high-dimensional datasets, producing clearer and more refined visualizations.
📝 Abstract
We present a new nonlinear dimensionality reduction method, MAPLE, that enhances UMAP by improving its manifold modeling. MAPLE employs a self-supervised learning approach to encode low-dimensional manifold geometry more efficiently. Central to this approach are maximum manifold capacity representations (MMCRs), which help untangle complex manifolds by compressing variance among locally similar data points while amplifying variance among dissimilar ones. This design is particularly effective for high-dimensional data with substantial intra-cluster variance and curved manifold structure, such as biological or image data. Our qualitative and quantitative evaluations demonstrate that MAPLE produces clearer visual cluster separations and finer subcluster resolution than UMAP at comparable computational cost.
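The MMCR idea of compressing variance within groups of similar points while spreading distinct groups apart can be sketched with the published MMCR objective: embed augmented views on the unit sphere, average them per sample, and maximize the nuclear norm of the centroid matrix. This is a minimal NumPy illustration under those assumptions, not MAPLE's actual implementation; the function name and array shapes are hypothetical.

```python
import numpy as np

def mmcr_loss(views: np.ndarray) -> float:
    """MMCR-style loss for `views` of shape (n_samples, n_views, dim).

    Lower (more negative) is better: minimizing -||C||_* encourages
    each sample's augmented views to align (low intra-sample variance,
    so centroids approach unit norm) and the centroids of different
    samples to spread apart (high inter-sample variance).
    """
    # Project every embedding onto the unit sphere.
    z = views / np.linalg.norm(views, axis=-1, keepdims=True)
    # Centroid of each sample's views: (n_samples, dim).
    c = z.mean(axis=1)
    # Negative nuclear norm (sum of singular values) of the centroid matrix.
    return -np.linalg.norm(c, ord="nuc")

# Collapsed case: all samples map to the same direction -> rank-1
# centroid matrix with small nuclear norm (weak separation).
collapsed = np.tile(np.eye(8)[0], (8, 4, 1))
# Spread case: views aligned within each sample, samples orthogonal
# to one another -> nuclear norm equals the number of samples.
spread = np.repeat(np.eye(8)[:, None, :], 4, axis=1)
print(mmcr_loss(spread) < mmcr_loss(collapsed))
```

In this toy setup the orthogonal, view-aligned configuration attains the lowest loss, matching the intuition in the abstract: variance among locally similar points (views of one sample) is compressed, while variance among dissimilar points (different samples) is amplified.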