Unsupervised Ground Metric Learning

📅 2025-07-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses distance metric learning in unsupervised settings, aiming at robust data clustering without label supervision. It proposes a unified framework that jointly optimizes optimal transport (OT) cost matrices over both the samples and the features of a dataset. By incorporating a nonlinear mapping and solving for its positive eigenvectors, the approach subsumes diverse distance forms, including Mahalanobis-like distances and graph Laplacian-regularized distances, within a single model. Methodologically, the authors employ a stochastic random function iteration algorithm and, for the first time, establish its linear convergence without requiring the operators to be paracontractive. Furthermore, Mahalanobis-type metrics and graph Laplacian regularization are integrated into the OT paradigm, substantially reducing computational complexity. Experiments demonstrate the method’s effectiveness, its flexibility across multiple distance models, and its numerical stability.

📝 Abstract
Data classification without access to labeled samples remains a challenging problem. It usually depends on an appropriately chosen distance between features, a topic addressed in metric learning. Recently, Huizing, Cantini and Peyré proposed to simultaneously learn optimal transport (OT) cost matrices between the samples and the features of a dataset. This leads to the task of finding positive eigenvectors of a certain nonlinear function that maps cost matrices to OT distances. With this basic idea in mind, we consider both the algorithmic and the modeling side of unsupervised metric learning. First, we examine appropriate algorithms and their convergence. In particular, we propose to use the stochastic random function iteration algorithm and prove that it converges linearly in our setting, although our operators are not paracontractive, as was previously required for convergence. Second, we ask the natural question of whether the OT distance can be replaced by other distances. We show how Mahalanobis-like distances fit into our considerations. Further, we examine an approach via graph Laplacians. In contrast to the previous settings, here we only have to deal with functions that are linear in the sought matrices, so that simple algorithms from linear algebra can be applied.
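The fixed-point idea in the abstract, alternately computing pairwise OT distances between samples (using the current feature cost) and between features (using the current sample cost), can be sketched as a power-iteration-like loop. This is a rough illustration, not the paper's actual algorithm: the entropic regularization, iteration counts, sup-norm normalization, and toy data below are all illustrative assumptions.

```python
import numpy as np

def sinkhorn_cost(a, b, C, eps=0.1, n_iter=200):
    """Entropic OT cost between histograms a, b for ground cost C (sketch)."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]   # approximate transport plan
    return np.sum(P * C)

def pairwise_ot(X, C):
    """Matrix of OT costs between the rows of X (each row a histogram)."""
    n = X.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = sinkhorn_cost(X[i], X[j], C)
    return D

rng = np.random.default_rng(0)
X = rng.random((6, 5))
X /= X.sum(axis=1, keepdims=True)          # samples as histograms over features
Y = (X / X.sum(axis=0, keepdims=True)).T   # features as histograms over samples

C_feat = np.ones((5, 5)) - np.eye(5)       # initial feature ground cost
for _ in range(10):                         # fixed-point / power-style iteration
    C_samp = pairwise_ot(X, C_feat)        # sample cost from feature cost
    C_samp /= C_samp.max()                 # normalize (eigenvector scaling)
    C_feat = pairwise_ot(Y, C_samp)        # feature cost from sample cost
    C_feat /= C_feat.max()
```

At a fixed point, each cost matrix is (up to scale) the matrix of OT distances induced by the other, i.e. a positive eigenvector of the nonlinear map described in the abstract.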
Problem

Research questions and friction points this paper is trying to address.

Learn optimal transport cost matrices without labeled samples
Study convergence of stochastic algorithms for metric learning
Explore alternative distances like Mahalanobis and graph Laplacians
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised metric learning via optimal transport
Stochastic random function iteration algorithm
Mahalanobis-like and graph Laplacian distances
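For reference, a Mahalanobis-like distance of the kind listed above is parameterized by a positive semidefinite matrix M; the 2×2 M below is an arbitrary illustrative choice, not one learned by the paper's method.

```python
import numpy as np

def mahalanobis(x, y, M):
    """Mahalanobis-like distance sqrt((x - y)^T M (x - y)) for PSD M."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

M = np.array([[2.0, 0.0],
              [0.0, 1.0]])               # illustrative PSD parameter matrix
mahalanobis(np.array([1.0, 0.0]),
            np.array([0.0, 0.0]), M)     # sqrt(2) ≈ 1.4142
```

Learning M from data (here, inside the OT framework) is what turns this family of distances into a metric-learning model.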