pMSz: A Distributed Parallel Algorithm for Correcting Extrema and Morse Smale Segmentations in Lossy Compression

📅 2026-01-05
🏛️ arXiv.org
🤖 AI Summary
Lossy compression can distort topological features, such as critical points and Morse-Smale segmentations, even under bounded error, which compromises the accuracy of scientific analysis. This work proposes a distributed parallel algorithm, described as the first to scale beyond a single GPU, for correcting piecewise-linear Morse-Smale segmentations (PLMSS) after compression. Rather than computing and correcting integral paths explicitly, the method preserves the steepest ascending and descending directions at every location. Combined with relaxed synchronization, this minimizes interprocess communication at negligible additional storage cost. On the Perlmutter supercomputer, the approach achieves over 90% parallel efficiency on 128 GPUs for real-world datasets.

📝 Abstract
Lossy compression, widely used by scientists to reduce data from simulations, experiments, and observations, can distort features of interest even under bounded error. Such distortions may compromise downstream analyses and lead to incorrect scientific conclusions in applications such as combustion and cosmology. This paper presents a distributed and parallel algorithm for correcting topological features, specifically piecewise-linear Morse-Smale segmentations (PLMSS), which decompose the domain into monotone regions labeled by their corresponding local minima and maxima. While a single-GPU algorithm (MSz) exists for PLMSS correction after compression, no methodology has been developed that scales beyond a single GPU for extreme-scale data. We identify the key bottleneck in scaling PLMSS correction as the parallel computation of integral paths, a communication-intensive computation that is notoriously difficult to scale. Instead of explicitly computing and correcting integral paths, our algorithm simplifies MSz by preserving steepest ascending and descending directions across all locations, thereby minimizing interprocess communication while introducing negligible additional storage overhead. With this simplified algorithm and relaxed synchronization, our method achieves over 90% parallel efficiency on 128 GPUs on the Perlmutter supercomputer for real-world datasets.
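To illustrate the idea of preserving steepest directions instead of integral paths, the following is a minimal serial sketch, not the pMSz implementation. It assumes a small 2D regular grid with 4-neighbor connectivity and ties broken by vertex index; the function names (`steepest_directions`, `segmentation`, `mismatched_vertices`) and the sample values are hypothetical. Each vertex stores a pointer to its steepest neighbor, the segmentation label is the extremum reached by following those pointers, and comparing pointers between the original and decompressed fields is a purely local check that flags the vertices a correction step would need to fix.

```python
# Sketch only: 2D regular grid, 4-neighbor connectivity, ties broken by
# vertex index. These are simplifying assumptions, not the paper's mesh,
# data layout, or correction procedure.

def neighbors(i, j, rows, cols):
    """Yield the in-bounds 4-neighbors of grid vertex (i, j)."""
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            yield ni, nj


def steepest_directions(field, ascending=True):
    """Map each vertex to its steepest neighbor, or to itself at an extremum."""
    rows, cols = len(field), len(field[0])
    sign = 1 if ascending else -1

    def key(v):
        # Order vertices by (value, flat index) so equal values never tie.
        return (sign * field[v[0]][v[1]], sign * (v[0] * cols + v[1]))

    target = {}
    for i in range(rows):
        for j in range(cols):
            candidates = [(i, j), *neighbors(i, j, rows, cols)]
            target[(i, j)] = max(candidates, key=key)
    return target


def segmentation(target):
    """Label each vertex with the extremum reached by following its pointers."""
    labels = {}
    for start in target:
        v, path = start, []
        while v not in labels and target[v] != v:
            path.append(v)
            v = target[v]
        root = labels.get(v, v)
        labels[v] = root
        for u in path:
            labels[u] = root
    return labels


def mismatched_vertices(original, decompressed):
    """Return vertices whose ascending or descending direction changed."""
    bad = set()
    for ascending in (True, False):
        a = steepest_directions(original, ascending)
        b = steepest_directions(decompressed, ascending)
        bad.update(v for v in a if a[v] != b[v])
    return sorted(bad)


# Hypothetical data: two values perturbed by at most 0.6.
original = [[1.0, 2.0, 3.0],
            [2.5, 5.0, 4.0],
            [1.5, 3.5, 0.5]]
decompressed = [[1.0, 2.0, 3.0],
                [2.5, 4.5, 4.6],
                [1.5, 3.5, 0.5]]

print(mismatched_vertices(original, decompressed))
```

In this sketch, if `mismatched_vertices` returns an empty list, the pointer maps are identical and so are the labels produced by `segmentation`. This is why checking directions locally can stand in for tracing paths across process boundaries.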
Problem

Research questions and friction points this paper is trying to address.

lossy compression
Morse-Smale segmentation
topological feature correction
extreme scale data
distributed parallel algorithm
Innovation

Methods, ideas, or system contributions that make the work stand out.

distributed parallel algorithm
Morse-Smale segmentation
lossy compression correction
integral path simplification
extreme-scale data

Authors

Yuxiao Li
Bosch Corporate Research
Spatio-Temporal Statistics, Gaussian Process, Knowledge Graph, Graph Neural Network

Mingze Xia
Oregon State University, Corvallis, OR, USA

Xin Liang
Oregon State University, Corvallis, OR, USA

Bei Wang
University of Utah, Salt Lake City, UT, USA

Robert Underwood
Assistant Computer Scientist, Argonne National Laboratory
Data for AI for Science, Lossy Compression, Distributed Computing, Reliable Computer Infrastructure

S. Di
Argonne National Laboratory, Lemont, IL, USA

Hemant Sharma
Argonne National Laboratory, Lemont, IL, USA

D. Beniwal
Argonne National Laboratory, Lemont, IL, USA

Franck Cappello
Argonne National Laboratory, IEEE Fellow
Parallel Processing, Parallel Computing, High Performance Computing, Fault Tolerance, Data Compression

Hanqi Guo
The Ohio State University
Data visualization and analysis