🤖 AI Summary
Collective communication on large-scale HPC systems is often bottlenecked by oversubscribed networks, where nodes are densely connected within a group but only sparsely connected across groups through global links. This paper introduces Bine (binomial negabinary) trees, a family of collective algorithms that combines binomial trees with a negabinary (base −2) representation of ranks. Bine trees keep the generality of binomial trees and butterflies while improving communication locality, cutting traffic over global links by up to 33%. The authors implement eight Bine-based collectives and evaluate them on four large-scale supercomputers with Dragonfly, Dragonfly+, oversubscribed fat-tree, and torus topologies. They report speedups of up to 5× and consistent reductions in global-link traffic across different vector sizes and node counts.
📝 Abstract
Communication locality plays a key role in the performance of collective operations on large HPC systems, especially on oversubscribed networks where groups of nodes are fully connected internally but sparsely linked through global connections. We present Bine (binomial negabinary) trees, a family of collective algorithms that improve communication locality. Bine trees maintain the generality of binomial trees and butterflies while cutting global-link traffic by up to 33%. We implement eight Bine-based collectives and evaluate them on four large-scale supercomputers with Dragonfly, Dragonfly+, oversubscribed fat-tree, and torus topologies, achieving up to 5x speedups and consistent reductions in global-link traffic across different vector sizes and node counts.
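For readers unfamiliar with the "negabinary" half of the name, the following is a minimal sketch of base −2 encoding and decoding. It illustrates only the number system; it does not reproduce the paper's tree construction or any of its collectives, and the function names `to_negabinary` and `from_negabinary` are hypothetical helpers chosen for this example.

```python
def to_negabinary(n: int) -> str:
    """Return the base -2 digit string of an integer (works for negatives too)."""
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        # Python's divmod with a negative divisor yields a remainder in {0, -1}.
        n, r = divmod(n, -2)
        if r < 0:
            # Shift the remainder into {0, 1} and compensate in the quotient.
            n, r = n + 1, r + 2
        digits.append(str(r))
    return "".join(reversed(digits))


def from_negabinary(s: str) -> int:
    """Decode a base -2 digit string using Horner's rule."""
    value = 0
    for ch in s:
        value = value * -2 + int(ch)
    return value


if __name__ == "__main__":
    for rank in range(8):
        print(rank, to_negabinary(rank))
```

Because the place values alternate in sign (1, −2, 4, −8, ...), every integer, positive or negative, has a unique digit string without a sign bit. For example, 6 is written `11010`, since 16 − 8 − 2 = 6.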