Revisiting Bruck: Phase-Efficient All-to-All Communication in Reconfigurable Networks

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the all-to-all communication bottleneck in distributed machine learning and high-performance computing over reconfigurable optical networks. It proposes ReTri, a scheme that co-designs the communication pattern and the network reconfiguration strategy. ReTri adapts the Bruck algorithm to reconfigurable architectures through bidirectional pairwise exchanges based on balanced ternary block propagation, completing all-to-all communication in ⌈log₃ n⌉ phases. It amortizes reconfiguration overhead by reusing topology states across communication phases. Preliminary simulations show that ReTri improves completion time by up to 10× over static all-to-all and by up to 2.1× over reconfigurable Bruck.

📝 Abstract

All-to-All communication is a key performance bottleneck for distributed machine learning (ML) and high-performance computing (HPC) workloads, where dense traffic increasingly stresses scale-up interconnects. While these ML and HPC workloads have driven unprecedented infrastructure demand, optical reconfigurable networks (ORNs) offer a promising path forward. By adapting the physical topology to the active workload, they reduce communication cost and improve bandwidth utilization. However, their benefit depends critically on whether the collective consists of structured phases that can be served by sparse and reusable topology states. In this paper, we revisit Bruck's All-to-All implementation and demonstrate the benefits of topology optimization when the communication pattern and the reconfiguration strategy are co-designed. We present ReTri, a bidirectional All-to-All schedule for ORNs. ReTri uses balanced ternary block propagation to complete All-to-All in $\lceil \log_3 n\rceil$ phases. The reconfiguration strategy induced by ReTri's pairwise bidirectional exchanges allows reconfiguration delays to be amortized across multiple phases. Preliminary simulations show that ReTri improves completion time by up to $10\times$ over static All-to-All, even for millisecond-scale reconfiguration delays, and by up to $2.1\times$ over reconfigurable Bruck.
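The phase count in the abstract can be illustrated with a small sketch. It is not ReTri's actual schedule, which the abstract does not spell out. It assumes one plausible reading of "balanced ternary block propagation": in phase $k$ every node exchanges with the two nodes at ring offsets $+3^k$ and $-3^k$, and a block follows the balanced ternary digits of its signed source-to-destination offset. The helper names `num_phases`, `balanced_ternary`, and `route` are hypothetical.

```python
def num_phases(n: int) -> int:
    """Smallest m with 3**m >= n, i.e. ceil(log_3 n), using integer arithmetic."""
    m, reach = 0, 1
    while reach < n:
        reach *= 3
        m += 1
    return m


def balanced_ternary(d: int, m: int) -> list[int]:
    """Digits t_0..t_{m-1} in {-1, 0, +1} such that sum(t_k * 3**k) == d."""
    digits = []
    for _ in range(m):
        r = d % 3            # Python's % yields 0, 1, or 2 for a positive modulus
        if r == 2:
            r = -1           # use digit -1 and carry into the next position
        digits.append(r)
        d = (d - r) // 3
    if d != 0:
        raise ValueError("offset does not fit in m balanced ternary digits")
    return digits


def route(src: int, dst: int, n: int) -> list[int]:
    """Nodes holding the block after each phase (assumed schedule, see lead-in)."""
    m = num_phases(n)
    d = (dst - src) % n
    if d > n // 2:           # pick the signed representative closest to zero
        d -= n
    path = [src]
    for k, t in enumerate(balanced_ternary(d, m)):
        # t == +1: forward over the +3**k link; t == -1: over the -3**k link;
        # t == 0: the block stays where it is during phase k.
        path.append((path[-1] + t * 3**k) % n)
    return path
```

For example, `route(0, 5, 9)` returns `[0, 8, 5]`: the offset 5 is congruent to -4 = -1 - 3 modulo 9, so the block moves backward in both of the two phases. Under this reading, every block reaches its destination within `num_phases(n)` phases because $m$ balanced ternary digits cover all $3^m \ge n$ signed offsets.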

Problem

Research questions and friction points this paper is trying to address.

All-to-All communication

reconfigurable networks

communication bottleneck

topology optimization

distributed machine learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reconfigurable Networks

All-to-All Communication

Bruck Algorithm