🤖 AI Summary
This study addresses the lack of systematic evaluation of dimensionality reduction methods in spatial transcriptomics. We establish a unified framework to benchmark PCA, NMF, autoencoders, VAEs, and hybrid embeddings across a range of latent dimensions and clustering resolutions. Methodologically, we introduce two biology-driven metrics, Cluster Marker Coherence (CMC) and Marker Exclusion Rate (MER), and use Pareto-optimal analysis for principled hyperparameter selection. The evaluation combines reconstruction error, explained variance, cluster cohesion, and biological fidelity. Results show that NMF maximizes marker gene enrichment, that VAE balances reconstruction quality and interpretability, and that MER-guided reassignment improves CMC by up to 12% on average, improving biological fidelity across all methods. This work offers a reproducible framework for biologically informed assessment of dimensionality reduction techniques in spatial omics.
📝 Abstract
We introduce a unified framework for evaluating dimensionality reduction techniques in spatial transcriptomics beyond standard PCA approaches. We benchmark six methods (PCA, NMF, autoencoder, VAE, and two hybrid embeddings) on a cholangiocarcinoma Xenium dataset, systematically varying the latent dimension ($k = 5$ to $40$) and the clustering resolution ($\rho = 0.1$ to $1.2$). Each configuration is evaluated using complementary metrics, including reconstruction error, explained variance, cluster cohesion, and two novel biologically motivated measures: Cluster Marker Coherence (CMC) and Marker Exclusion Rate (MER). Our results demonstrate distinct performance profiles: PCA provides a fast baseline, NMF maximizes marker enrichment, VAE balances reconstruction and interpretability, and autoencoders occupy a middle ground. We provide systematic hyperparameter selection using Pareto-optimal analysis and demonstrate that MER-guided reassignment improves biological fidelity across all methods, with CMC scores improving by up to 12% on average. This framework enables principled selection of dimensionality reduction methods tailored to specific spatial transcriptomics analyses.
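As a rough illustration of the Pareto-optimal selection step, the sketch below keeps only those $(k, \rho)$ configurations that no other configuration beats on every metric at once. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `pareto_front`, the choice of two metrics, and all numeric values are hypothetical and exist only to make the example runnable.

```python
from typing import List, Sequence


def pareto_front(scores: Sequence[Sequence[float]]) -> List[bool]:
    """Return one flag per row: True if the row is non-dominated.

    Every metric is treated as higher-is-better, so lower-is-better
    metrics (such as reconstruction error) must be negated beforehand.
    """

    def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
        # a dominates b if it is at least as good on every metric
        # and strictly better on at least one.
        return all(x >= y for x, y in zip(a, b)) and any(
            x > y for x, y in zip(a, b)
        )

    return [not any(dominates(other, row) for other in scores) for row in scores]


# Hypothetical (k, rho) configurations; the metric values are made up
# for illustration and are NOT results from the paper.
configs = [(5, 0.1), (10, 0.4), (20, 0.8), (40, 1.2)]
recon_error = [0.50, 0.35, 0.30, 0.32]  # lower is better
cmc = [0.60, 0.70, 0.65, 0.64]          # higher is better

# Negate reconstruction error so both columns are higher-is-better.
scores = [(-e, c) for e, c in zip(recon_error, cmc)]
mask = pareto_front(scores)
pareto_configs = [cfg for cfg, keep in zip(configs, mask) if keep]
print(pareto_configs)
```

With these illustrative numbers, two configurations survive: one has the higher CMC and the other the lower reconstruction error, so neither dominates the other. Choosing between the survivors is then a matter of which metric matters more for the analysis at hand.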