🤖 AI Summary
To better capture manifold structure in high-dimensional medical imaging data, this paper proposes using the Hellinger–Kantorovich (HK) metric from unbalanced optimal transport (UOT) in a unified pipeline for dimensionality reduction followed by supervised or unsupervised learning. Unlike balanced optimal transport (OT), which requires the two distributions to carry equal total mass, the HK metric allows mass to be created or destroyed, so measures with different total mass can be compared directly. On benchmark datasets including MedMNIST, UOT improves on average over both Euclidean and OT-based methods, as verified by statistical hypothesis tests. On MedMNIST, UOT outperforms OT in 81% of classification settings; in clustering, it outperforms OT in 83% of settings and outperforms both the Euclidean and OT baselines in 58%. The core contribution is the integration of the HK metric into a manifold learning pipeline and its systematic comparison against Euclidean and OT baselines.
📝 Abstract
This paper proposes using the Hellinger–Kantorovich metric from unbalanced optimal transport (UOT) in a pipeline that combines dimensionality reduction with supervised and unsupervised learning. The performance of UOT is compared with that of regular (balanced) OT and Euclidean-based dimensionality reduction methods on several benchmark datasets, including MedMNIST. The experimental results show that, on average, UOT improves on both Euclidean and OT-based methods, as verified by statistical hypothesis tests. In particular, on the MedMNIST datasets, UOT outperforms OT in classification 81% of the time. For clustering on MedMNIST, UOT outperforms OT 83% of the time and outperforms both other metrics 58% of the time.
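To make the difference between balanced and unbalanced transport concrete, the following is a minimal sketch of an entropy-regularized UOT solver with KL penalties on the marginals, using the standard scaling iterations. It is an illustration under stated assumptions, not the paper's implementation: the function name `uot_plan`, the parameters `eps` and `rho`, the squared Euclidean cost, and the toy 1-D histograms are all hypothetical choices. The HK metric itself corresponds to a specific cost function and is not reproduced here.

```python
import numpy as np

def uot_plan(a, b, C, eps=0.05, rho=1.0, n_iter=500):
    """Entropic unbalanced OT plan with KL marginal penalties.

    a, b : nonnegative mass vectors (totals may differ)
    C    : cost matrix of shape (len(a), len(b))
    eps  : entropic regularization strength (assumed value)
    rho  : weight of the KL marginal penalties (assumed value)
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    fi = rho / (rho + eps)        # exponent < 1 softens the marginal constraints
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    return u[:, None] * K * v[None, :]

# Toy example: two 1-D histograms with different total mass.
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 1.0, 7)
C = (x[:, None] - y[None, :]) ** 2   # squared Euclidean cost (illustrative only)
a = np.full(5, 1.0 / 5)              # total mass 1
b = np.full(7, 2.0 / 7)              # total mass 2
P = uot_plan(a, b, C)

print("transported mass:", P.sum())
print("row marginals:", P.sum(axis=1))
print("column marginals:", P.sum(axis=0))
```

Because the two inputs carry different total mass, the plan cannot match both marginals at once; the KL penalties decide how much mass is transported and how much is created or removed. In a pipeline like the one described above, pairwise distances derived from such plans would feed the dimensionality reduction step.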