🤖 AI Summary
Noisy data can introduce spurious edges into nearest-neighbor graphs: edges that shortcut through the ambient space and distort the low-dimensional manifold structure. To remove them, this paper proposes ORC-ManL, a graph pruning algorithm motivated by manifold learning. It establishes a theoretical link between strongly negative Ollivier–Ricci curvature (ORC) and shortcut edges, and combines ORC with an estimate of local metric distortion to obtain a geometry-driven pruning criterion. Empirically, ORC-ManL outperforms alternative pruning methods and improves downstream tasks including manifold learning, persistent homology, intrinsic dimension estimation, and clustering of single-cell RNA-seq data, with reported improvements of 15–32% in downstream accuracy. Convergence experiments support the theoretical findings.
📝 Abstract
We introduce ORC-ManL, a new algorithm to prune spurious edges from nearest-neighbor graphs using a criterion based on Ollivier–Ricci curvature and estimated metric distortion. Our motivation comes from manifold learning: we show that when the data generating the nearest-neighbor graph consists of noisy samples from a low-dimensional manifold, edges that shortcut through the ambient space have more negative Ollivier–Ricci curvature than edges that lie along the data manifold. We demonstrate that our method outperforms alternative pruning methods and that it significantly improves performance on many downstream geometric data analysis tasks that take nearest-neighbor graphs as input. Specifically, we evaluate on manifold learning, persistent homology, and dimension estimation, among other tasks. We also show that ORC-ManL can be used to improve clustering and manifold learning of single-cell RNA sequencing data. Finally, we provide empirical convergence experiments that support our theoretical findings.
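The two-part criterion described above (negative curvature, then a distortion check) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common unweighted form of Ollivier–Ricci curvature, κ(x, y) = 1 − W₁(μₓ, μᵧ) / d(x, y), with μₓ uniform over the neighbors of x and d the hop distance. The paper's metric distortion estimate is replaced by a simpler stand-in: the hop distance between the endpoints once all candidate edges are removed. The function names, `orc_threshold`, `detour_threshold`, and the toy graph are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.sparse.csgraph import shortest_path


def hop_distances(adj):
    """All-pairs hop distances of an unweighted graph (dense 0/1 matrix)."""
    return shortest_path(adj, directed=False, unweighted=True)


def ollivier_ricci(adj, dist, x, y):
    """ORC of edge (x, y) with uniform measures on the neighbors of x and y
    (an assumed, standard variant; the paper's exact measures may differ)."""
    nx, ny = np.flatnonzero(adj[x]), np.flatnonzero(adj[y])
    cost = dist[np.ix_(nx, ny)]
    m, n = cost.shape
    # Wasserstein-1 as a linear program over the flattened transport plan.
    a_eq = np.zeros((m + n, m * n))
    for i in range(m):
        a_eq[i, i * n:(i + 1) * n] = 1.0   # row i of the plan sums to 1/m
    for j in range(n):
        a_eq[m + j, j::n] = 1.0            # column j of the plan sums to 1/n
    b_eq = np.concatenate([np.full(m, 1.0 / m), np.full(n, 1.0 / n)])
    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return 1.0 - res.fun / dist[x, y]


def prune(adj, orc_threshold=-0.5, detour_threshold=3):
    """Remove edges that are both very negatively curved and whose endpoints
    are far apart once all candidates are removed. Both thresholds are
    hypothetical illustration values, not taken from the paper."""
    dist = hop_distances(adj)
    edges = list(zip(*np.nonzero(np.triu(adj))))
    candidates = [(i, j) for i, j in edges
                  if ollivier_ricci(adj, dist, i, j) < orc_threshold]
    reduced = adj.copy()
    for i, j in candidates:
        reduced[i, j] = reduced[j, i] = 0.0
    detour = hop_distances(reduced)        # inf if endpoints get disconnected
    pruned, removed = adj.copy(), []
    for i, j in candidates:
        if detour[i, j] > detour_threshold:
            pruned[i, j] = pruned[j, i] = 0.0
            removed.append((int(i), int(j)))
    return pruned, removed


def ring_with_chord(n=20):
    """Toy 'circle' graph: each node links to its 2 nearest neighbors on each
    side, plus one shortcut edge across the ring from node 0 to node n // 2."""
    adj = np.zeros((n, n))
    for i in range(n):
        for step in (1, 2):
            adj[i, (i + step) % n] = adj[(i + step) % n, i] = 1.0
    adj[0, n // 2] = adj[n // 2, 0] = 1.0
    return adj


if __name__ == "__main__":
    adj = ring_with_chord(20)
    pruned, removed = prune(adj)
    print("removed edges:", removed)
```

On this toy graph the chord across the ring is the only edge whose curvature falls below the assumed threshold, and its endpoints are five hops apart without it, so it is the only edge removed. The second check matters in general: a legitimate bottleneck edge can also be negatively curved, but its endpoints stay close in the reduced graph, so it survives.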