Topological Metric for Unsupervised Embedding Quality Evaluation

📅 2025-12-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Because ground-truth labels are unavailable in unsupervised representation learning, existing embedding-quality metrics rely on strong assumptions such as linear separability or covariance structure. To address this, we propose *Persistence*, the first unsupervised, topology-aware evaluation metric grounded in persistent homology. It quantifies the multi-scale geometric structure and topological richness of embedding spaces, characterizing global, nonlinear patterns without labels or model-specific assumptions, which avoids the theoretical limitations of conventional metrics. Validated empirically across diverse domains, Persistence achieves the highest correlation with downstream task performance (average Pearson *r* = 0.89), significantly outperforming existing unsupervised evaluation methods, and it supports model selection and hyperparameter optimization.

📝 Abstract
Modern representation learning increasingly relies on unsupervised and self-supervised methods trained on large-scale unlabeled data. While these approaches achieve impressive generalization across tasks and domains, evaluating embedding quality without labels remains an open challenge. In this work, we propose Persistence, a topology-aware metric based on persistent homology that quantifies the geometric structure and topological richness of embedding spaces in a fully unsupervised manner. Unlike metrics that assume linear separability or rely on covariance structure, Persistence captures global and multi-scale organization. Empirical results across diverse domains show that Persistence consistently achieves top-tier correlations with downstream performance, outperforming existing unsupervised metrics and enabling reliable model and hyperparameter selection.
Problem

Research questions and friction points this paper is trying to address.

Evaluates unsupervised embedding quality without labels
Quantifies geometric structure and topological richness of embeddings
Enables reliable model and hyperparameter selection via correlation with downstream performance
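
The selection step in the last point can be illustrated with a minimal sketch: compute the Pearson correlation between an unsupervised metric and downstream performance across candidate models, then pick the candidate with the best metric value. The function names `pearson_r` and `select_best` are hypothetical, and "higher metric is better" is an assumption made for illustration; neither is taken from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

def select_best(metric_by_model):
    """Return the model name with the highest metric value.

    Assumption for this sketch: a larger metric value indicates a
    better embedding.
    """
    return max(metric_by_model, key=metric_by_model.get)
```

In practice `xs` would hold the metric value for each candidate model and `ys` the corresponding downstream score. A correlation close to 1 indicates that ranking models by the label-free metric approximates ranking them by downstream performance.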
Innovation

Methods, ideas, or system contributions that make the work stand out.

Topology-aware metric based on persistent homology
Quantifies geometric structure and topological richness in a fully unsupervised manner
Captures global and multi-scale organization of embeddings
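
To give a feel for what a persistent-homology summary of an embedding looks like, here is a minimal sketch restricted to 0-dimensional homology (connected components). It relies on a standard fact: for a Vietoris-Rips filtration of a point cloud, every component is born at scale 0 and the death times are exactly the edge lengths of a Euclidean minimum spanning tree. The two summaries shown (total persistence and persistence entropy) are common generic choices; they are assumptions for illustration and not the definition of the Persistence metric, which this summary does not specify. All function names are hypothetical.

```python
import math

def h0_death_times(points):
    """Death times of H0 classes in a Vietoris-Rips filtration.

    These equal the edge lengths of a Euclidean minimum spanning tree,
    built here with Prim's algorithm in O(n^2) time.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known distance from each point to the tree
    best[0] = 0.0
    deaths = []
    for step in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            deaths.append(best[u])  # the edge that merged u's component
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return sorted(deaths)

def total_persistence(deaths):
    """Sum of H0 lifetimes (births are all 0, so lifetime equals death time)."""
    return sum(deaths)

def persistence_entropy(deaths):
    """Shannon entropy of the normalized lifetimes."""
    total = sum(deaths)
    if total == 0:
        return 0.0
    probs = [d / total for d in deaths if d > 0]
    return -sum(p * math.log(p) for p in probs)

# Usage: each row is one embedding vector.
embedding = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
deaths = h0_death_times(embedding)
```

Long-lived components (large death times) correspond to well-separated groups of points, while many short lifetimes indicate dense local structure. Capturing loops and higher-dimensional features would require a full persistent homology library rather than this spanning-tree shortcut.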