Multiscale Grassmann Manifolds for Single-Cell Data Analysis

📅 2025-11-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Single-cell data analysis faces challenges in capturing intrinsic gene expression correlations and multiscale geometric structures using conventional Euclidean vector representations. To address this, we propose a Grassmann manifold-based multiscale embedding framework that unifies subspace geometry modeling with multiscale representation learning. Specifically, we design a power-based scale-sampling function to adaptively integrate subspace geometric information across resolutions, and develop a differentiable Grassmann embedding module enabling nonlinear, structure-preserving dimensionality reduction of high-dimensional single-cell data. Extensive experiments on nine scRNA-seq datasets demonstrate substantial improvements in clustering stability (average +12.3% Adjusted Rand Index) and in both local and global structural fidelity, with the gains most pronounced on small and medium-sized datasets (n < 10k). Our approach establishes a novel geometric deep learning paradigm for modeling single-cell heterogeneity.

📝 Abstract
Single-cell data analysis seeks to characterize cellular heterogeneity based on high-dimensional gene expression profiles. Conventional approaches represent each cell as a vector in Euclidean space, which limits their ability to capture intrinsic correlations and multiscale geometric structures. We propose a multiscale framework based on Grassmann manifolds that integrates machine learning with subspace geometry for single-cell data analysis. By generating embeddings at multiple representation scales, the framework combines features from different geometric views into a unified Grassmann manifold. A power-based scale sampling function is introduced to control the selection of scales and balance information across resolutions. Experiments on nine benchmark single-cell RNA-seq datasets demonstrate that the proposed approach effectively preserves meaningful structures and provides stable clustering performance, particularly for small to medium-sized datasets. These results suggest that Grassmann manifolds offer a coherent and informative foundation for analyzing single-cell data.
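To make the subspace view concrete, the sketch below shows the standard way to treat a k-dimensional subspace as a point on a Grassmann manifold and to measure the geodesic distance between two such points through their principal angles. It illustrates the general geometry only and is not the paper's pipeline: the function names `subspace_basis` and `grassmann_distance`, and the choice of building a subspace from the top-k left singular vectors of a data matrix, are assumptions made for this example.

```python
import numpy as np


def subspace_basis(X, k):
    """Orthonormal basis (d x k) for the top-k left singular subspace of X (d x m).

    Hypothetical helper: the abstract does not say how subspaces are built.
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]


def principal_angles(U, V):
    """Principal angles between span(U) and span(V).

    U and V must have orthonormal columns. The cosines of the principal
    angles are the singular values of U^T V.
    """
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    # Clip to guard against round-off slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def grassmann_distance(U, V):
    """Geodesic distance on the Grassmann manifold: the 2-norm of the angles."""
    return float(np.linalg.norm(principal_angles(U, V)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = subspace_basis(rng.normal(size=(50, 20)), k=3)
    B = subspace_basis(rng.normal(size=(50, 20)), k=3)
    print("distance between two random 3-dimensional subspaces:",
          grassmann_distance(A, B))
```

The distance depends only on the subspaces, not on the bases chosen for them: rotating the columns of `A` by any orthogonal matrix leaves the result unchanged up to round-off, which is the property that distinguishes a Grassmann representation from a Euclidean vector one.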
Problem

Research questions and friction points this paper is trying to address.

Characterizing cellular heterogeneity from high-dimensional gene expression profiles
Capturing intrinsic correlations and multiscale geometric structures in data
Improving clustering performance for small to medium-sized single-cell datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Grassmann manifolds for single-cell data analysis
Integrates machine learning with subspace geometry
Employs power-based scale sampling for resolution balance
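
The summary above does not give the form of the power-based scale sampling function, so the following is only a guess at what such a function could look like. It assumes scales are drawn from an interval [s_min, s_max] by warping an evenly spaced grid with an exponent; the name `power_scale_samples` and all of its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def power_scale_samples(s_min, s_max, n_scales, power):
    """Hypothetical power-based scale sampler (assumed form, not the paper's).

    Maps an even grid t in [0, 1] to s_min + (s_max - s_min) * t**power.
    power == 1 gives evenly spaced scales, power > 1 concentrates scales
    near s_min, and 0 < power < 1 concentrates them near s_max.
    """
    if n_scales < 2:
        raise ValueError("n_scales must be at least 2")
    if power <= 0:
        raise ValueError("power must be positive")
    t = np.linspace(0.0, 1.0, n_scales)
    return s_min + (s_max - s_min) * t ** power


if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0):
        print(p, power_scale_samples(1.0, 9.0, 5, p))
```

Under this assumed form, a single exponent controls how the sampled scales are distributed between fine and coarse resolutions, which is one simple way to realize the "resolution balance" the bullet describes.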