🤖 AI Summary
Accurately estimating intrinsic manifold curvature, particularly principal curvatures, from large-scale single-cell transcriptomic data remains a fundamental challenge because of high noise, sparsity, and nonlinear geometry. To address this, we propose Adaptive Local Principal Component Analysis (AdaL-PCA), a method that estimates principal curvatures and is combined with a PHATE embedding. AdaL-PCA introduces a data-driven, adaptive neighborhood selection strategy that balances local linearity against geometric fidelity, improving the robustness and accuracy of curvature estimation. On synthetic curved manifolds, AdaL-PCA achieves state-of-the-art performance in principal curvature estimation. When applied to mouse hematopoietic stem cell differentiation data, it identifies geometric singularities and critical transitional states along developmental trajectories. These findings suggest a geometric approach to studying the mechanisms of cell fate determination, linking differential geometry to biological interpretation in single-cell analysis.
📝 Abstract
The rapidly growing field of single-cell transcriptomic sequencing (scRNAseq) presents challenges for data analysis because of the massive datasets it produces. A common assumption in manifold learning is that datasets lie on a lower-dimensional manifold. This makes it possible to study the geometry of point clouds by extracting meaningful descriptors such as curvature. In this work, we present Adaptive Local PCA (AdaL-PCA), a data-driven method for accurately estimating various notions of intrinsic curvature on data manifolds, in particular principal curvatures for surfaces. The model relies on local PCA to estimate the tangent spaces. Evaluated on sampled surfaces, AdaL-PCA achieves state-of-the-art results. Combined with a PHATE embedding, the model applied to single-cell RNA sequencing data allows us to identify key variations in cellular differentiation.
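The idea of using local PCA for tangent spaces and then reading curvature off the local geometry can be illustrated with a minimal sketch. This is not the authors' AdaL-PCA: it assumes a fixed neighborhood size `k` instead of an adaptive one, and it swaps in a standard least-squares quadratic fit of the height over the tangent plane to approximate principal curvatures. The function names `local_pca_frame` and `principal_curvatures` are hypothetical, chosen for illustration only.

```python
import numpy as np


def local_pca_frame(points, center, k):
    """PCA on the k nearest neighbors of `center` (a point in R^3).

    Returns the neighbors, a 2x3 tangent basis, and a unit normal.
    Assumption: fixed k, unlike the adaptive selection the abstract describes.
    """
    dists = np.linalg.norm(points - center, axis=1)
    nbrs = points[np.argsort(dists)[:k]]
    centered = nbrs - nbrs.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return nbrs, vt[:2], vt[2]


def principal_curvatures(points, center, k=50):
    """Approximate principal curvatures of a sampled surface at `center`.

    Fits h(u, v) = a*u + b*v + 0.5*(c*u^2 + 2*d*u*v + e*v^2) in the local
    PCA frame and returns the eigenvalues of the Hessian [[c, d], [d, e]].
    The sign depends on the (arbitrary) orientation of the PCA normal, and
    the result is only approximate when the fitted slope (a, b) is not small.
    """
    nbrs, tangent, normal = local_pca_frame(points, center, k)
    rel = nbrs - center
    u, v, h = rel @ tangent[0], rel @ tangent[1], rel @ normal
    design = np.column_stack([u, v, 0.5 * u**2, u * v, 0.5 * v**2])
    coef, *_ = np.linalg.lstsq(design, h, rcond=None)
    _, _, c, d, e = coef
    return np.linalg.eigvalsh(np.array([[c, d], [d, e]]))


# Usage on a synthetic surface: points sampled on a sphere of radius 2.
rng = np.random.default_rng(0)
radius = 2.0
sphere = rng.normal(size=(2000, 3))
sphere = radius * sphere / np.linalg.norm(sphere, axis=1, keepdims=True)

kappa = principal_curvatures(sphere, sphere[0], k=50)
print(np.abs(kappa))  # both values should be close to 1 / radius
```

On a sphere both principal curvatures equal the inverse radius, which makes it a convenient sanity check. The choice of `k` trades off noise against bias from higher-order terms, which is the kind of trade-off an adaptive neighborhood selection is meant to handle.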