🤖 AI Summary
Problem: Traditional Wasserstein distances for comparing persistence diagrams and merge trees neglect the spatial locations of topological features, limiting discriminative power in geometric contexts.
Method: We propose a region-aware weighted Wasserstein metric that models the spatial distribution of topological features via extremum-aligned regions, i.e. subdomains of the input domain aligned with the extrema of the function. A tunable parameter controls the influence of the region properties on the distance, while subset sampling and attribute compression keep computation time and memory usage under control within the optimal transport framework.
Results: Experiments on openly available ensemble data show that the metric is more discriminative than the classical Wasserstein distance, with running times on the order of minutes on average. The metric supports the detection of key phases in time-varying ensembles, the tracking of topological features over time, and dimensionality-reduced visualization of ensemble members. It provides a scalable and expressive tool for large-scale topological data analysis, balancing fidelity, interpretability, and computational tractability.
📝 Abstract
This paper presents a generalization of the Wasserstein distance for both persistence diagrams and merge trees [20], [66] that takes advantage of the regions of their topological features in the input domain. Specifically, we redefine the comparison of topological features as a distance between the values of their extrema-aligned regions. This yields a more discriminative metric than the classical Wasserstein distance, which it generalizes through an input parameter that adjusts the impact of the region properties on the distance. We present two strategies to control the computation time and memory footprint of our method: the first restricts the computation to subsets of the regions, and the second compresses the regions' properties to obtain low-memory representations. Extensive experiments on openly available ensemble data demonstrate the efficiency of our method, with running times on the order of minutes on average. We show the utility of our contributions with two applications. First, we use the assignments between topological features provided by our method to track their evolution in time-varying ensembles, and we propose temporal persistence curves to facilitate the understanding of how these features appear, disappear and change over time. Second, our method makes it possible to compute a distance matrix of an ensemble, which can be used for dimensionality reduction to visually represent all its members in 2D. We show that such distance matrices also make it possible to detect key phases in the ensemble. Finally, we provide a C++ implementation that can be used to reproduce our results.
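
To make the role of the input parameter concrete, the following toy sketch blends the classical birth/death cost with a region cost. It is not the paper's definition or its C++ implementation. Everything here is an assumption made for illustration: the names `feature_cost`, `diagonal_cost` and `region_wasserstein`, the fixed-length `region` descriptor attached to each feature, the convex blend with weight `lam`, and the choice to compare an unmatched feature against an all-zero region. The matching is found by brute force over permutations, so it is only usable on very small diagrams.

```python
from itertools import permutations
from math import sqrt

# A feature is (birth, death, region). `region` is a hypothetical fixed-length
# tuple of values summarising the feature's region in the input domain.


def feature_cost(f, g, lam):
    """Assumed blend: (1 - lam) * squared birth/death distance + lam * squared region distance."""
    (b1, d1, r1), (b2, d2, r2) = f, g
    classical = (b1 - b2) ** 2 + (d1 - d2) ** 2
    region = sum((x - y) ** 2 for x, y in zip(r1, r2))
    return (1.0 - lam) * classical + lam * region


def diagonal_cost(f, lam):
    """Cost of leaving a feature unmatched (sent to the diagonal).

    The classical term is the squared distance to the diagonal projection.
    Comparing the region against an all-zero region is an assumption.
    """
    b, d, r = f
    classical = (d - b) ** 2 / 2.0
    region = sum(x * x for x in r)
    return (1.0 - lam) * classical + lam * region


def region_wasserstein(A, B, lam=0.5):
    """Brute-force 2-Wasserstein-style distance between two small feature lists."""
    n, m = len(A), len(B)
    size = n + m  # augmented problem: every feature may also go to the diagonal

    def cost(i, j):
        if i < n and j < m:
            return feature_cost(A[i], B[j], lam)  # feature matched to feature
        if i < n:
            return diagonal_cost(A[i], lam)       # feature of A left unmatched
        if j < m:
            return diagonal_cost(B[j], lam)       # feature of B left unmatched
        return 0.0                                # diagonal slot to diagonal slot

    best = min(
        sum(cost(i, p[i]) for i in range(size))
        for p in permutations(range(size))
    )
    return sqrt(best)


# Two diagrams with identical birth/death pairs but different regions.
A = [(0.0, 1.0, (1.0, 0.0))]
B = [(0.0, 1.0, (0.0, 1.0))]
d_classical = region_wasserstein(A, B, lam=0.0)  # region term switched off
d_blended = region_wasserstein(A, B, lam=0.5)    # region term contributes
```

In this sketch, `lam = 0` reduces the cost to the classical birth/death term, so `A` and `B` are indistinguishable, while any `lam > 0` separates them through their regions. A practical version would replace the permutation search with an assignment solver.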