SOLANET: Distributed Neighbor Graph Construction on GPU-Accelerated Systems

📅 2026-05-26

🤖 AI Summary

This work addresses the irregular computation patterns and scalability bottlenecks that arise when constructing approximate neighbor graphs on distributed GPU systems. The authors propose a distributed framework that first builds local neighbor graphs from data shards on individual GPUs and then uses MPI one-sided communication to pull remote graphs from other GPUs, refining each local graph through approximate nearest neighbor (ANN) searches over them. They also introduce a lock-free single-GPU graph construction algorithm for AMD GPUs. On a single MI300A APU, the single-GPU implementation outperforms a state-of-the-art GPU-based method across multiple datasets. In the distributed setting, the framework achieves an 11× speedup when scaling from 32 to 512 APUs on 1 billion points and a 6.9× speedup when scaling from 64 to 512 APUs on 2 billion points.
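The two-phase flow described above (local build per shard, then refinement against remote data) can be illustrated with a minimal single-process sketch. It is not SOLANET's implementation: shards stand in for GPUs, an exhaustive scan of each remote shard replaces the ANN search over remote graphs, plain list access replaces MPI one-sided fetches, and all function names are hypothetical.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_local_graph(points, ids, k):
    """Phase 1 stand-in: brute-force k-NN graph restricted to one shard."""
    graph = {}
    for i in ids:
        cands = [(sq_dist(points[i], points[j]), j) for j in ids if j != i]
        graph[i] = heapq.nsmallest(k, cands)  # sorted list of (distance, id)
    return graph


def refine_with_remote(points, graph, local_ids, remote_ids, k):
    """Phase 2 stand-in: merge candidates from one remote shard.

    Assumption: an exhaustive scan replaces the approximate search over
    the remote graph, so this sketch yields exact neighbors.
    """
    for i in local_ids:
        remote = [(sq_dist(points[i], points[j]), j) for j in remote_ids]
        graph[i] = heapq.nsmallest(k, graph[i] + remote)


def build_global_graph(points, num_shards, k):
    """Partition the data, build local graphs, then refine across shards."""
    ids = list(range(len(points)))
    shards = [ids[s::num_shards] for s in range(num_shards)]  # round-robin split
    graph = {}
    for shard in shards:
        graph.update(build_local_graph(points, shard, k))
    for s, shard in enumerate(shards):
        for t, other in enumerate(shards):
            if s != t:
                refine_with_remote(points, graph, shard, other, k)
    return graph


random.seed(0)
points = [(random.random(), random.random()) for _ in range(40)]
graph = build_global_graph(points, num_shards=4, k=3)
```

Because the refinement here scans every remote point, the result matches a brute-force graph over the whole dataset; an approximate search would trade some of that accuracy for far less remote work.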

📝 Abstract

Neighbor graphs capture relationships among data points and are widely used in data analytics and AI workloads. Many studies have explored approximate construction methods for single-node systems, including GPUs. However, extending these methods to distributed systems, both to handle larger data and to accelerate construction further, remains challenging due to irregular computation patterns. We present SOLANET, a GPU-accelerated distributed neighbor graph construction toolkit. SOLANET first constructs local graphs on each GPU after data partitioning and then refines them via approximate nearest neighbor (ANN) searches over remote graphs pulled from other GPUs using MPI one-sided operations. SOLANET also provides a lock-free single-GPU neighbor graph construction algorithm for AMD GPUs. Our single-GPU implementation outperforms a state-of-the-art GPU-based approximate neighbor graph construction implementation across multiple datasets on a single MI300A APU. Furthermore, SOLANET demonstrates an 11× speedup from 32 to 512 APUs for 1 billion data points and a 6.9× speedup from 64 to 512 APUs for 2 billion points.
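To put the reported speedups in context, one can compare each against the ideal speedup implied by the increase in APU count. The sketch below uses the standard definition of strong-scaling efficiency (measured speedup divided by ideal speedup); the efficiency values are derived here from the abstract's figures and are not reported by the authors, and the function name is hypothetical.

```python
def strong_scaling_efficiency(speedup, base_units, scaled_units):
    """Measured speedup divided by the ideal speedup (ratio of unit counts)."""
    ideal = scaled_units / base_units
    return speedup / ideal


# 1 billion points: 11x speedup going from 32 to 512 APUs (ideal: 16x).
eff_1b = strong_scaling_efficiency(11.0, 32, 512)

# 2 billion points: 6.9x speedup going from 64 to 512 APUs (ideal: 8x).
eff_2b = strong_scaling_efficiency(6.9, 64, 512)

print(f"1B points: {eff_1b:.1%} of ideal")
print(f"2B points: {eff_2b:.1%} of ideal")
```

Under this definition the two runs come to roughly 69% and 86% of ideal scaling, respectively. The two figures use different baselines (32 versus 64 APUs), so they are not directly comparable.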

Problem

Research questions and friction points this paper is trying to address.

distributed systems

neighbor graph construction

GPU acceleration

approximate nearest neighbor

irregular computation

Innovation

Methods, ideas, or system contributions that make the work stand out.

distributed neighbor graph

GPU acceleration

approximate nearest neighbor (ANN)