🤖 AI Summary
This work addresses the inefficiency of traditional label propagation algorithms, which require recomputation over the entire graph when processing incremental batch data. To overcome this limitation, the authors propose DynLP, the first GPU-accelerated dynamic label propagation algorithm that operates on batches of newly added data. DynLP selectively updates only the sparse subgraph affected by incoming edges, thereby avoiding costly global recomputation. By integrating a parallel dynamic graph update strategy tailored to GPU architecture, the method achieves substantial performance gains on large-scale graphs. Experimental results demonstrate an average speedup of 13× and a peak speedup of up to 102× compared to existing approaches, significantly advancing the state of the art in scalable and efficient label propagation.
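The selective-update idea described above — re-evaluating only nodes reachable from a batch's new edges, rather than the whole graph — can be sketched as a toy sequential worklist version. This is an illustrative simplification under assumed semantics (majority-vote label propagation with fixed seed labels); the class name `IncrementalLP` and its API are hypothetical, and the actual DynLP runs this process in parallel on the GPU:

```python
from collections import defaultdict, deque, Counter

class IncrementalLP:
    """Toy CPU sketch of incremental label propagation.

    Hypothetical API for illustration only -- not the paper's
    GPU-parallel DynLP implementation.
    """

    def __init__(self, seeds):
        self.adj = defaultdict(set)   # undirected adjacency lists
        self.label = dict(seeds)      # node -> current label
        self.seeds = set(seeds)       # seed labels never change

    def _best_label(self, v):
        # Majority vote over labeled neighbors (ties broken arbitrarily).
        votes = Counter(self.label[u] for u in self.adj[v] if u in self.label)
        return votes.most_common(1)[0][0] if votes else self.label.get(v)

    def add_batch(self, edges):
        # 1. Insert the new edges; seed the worklist with their endpoints.
        frontier = deque()
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)
            frontier.extend((u, v))
        # 2. Propagate only through nodes whose label actually changes,
        #    so untouched parts of the graph are never revisited.
        while frontier:
            v = frontier.popleft()
            if v in self.seeds:
                continue
            new = self._best_label(v)
            if new is not None and new != self.label.get(v):
                self.label[v] = new
                frontier.extend(self.adj[v])  # neighbors may now flip too
```

For example, after `IncrementalLP({0: 'A'})` and `add_batch([(0, 1), (1, 2)])`, nodes 1 and 2 receive label `'A'`; a later `add_batch([(2, 3)])` touches only the neighborhood of the new edge. Note this asynchronous worklist scheme is a simplification: it does not model DynLP's batched parallelism or its GPU-oriented dynamic graph representation.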
📝 Abstract
Semi-supervised learning aims to infer class labels using only a small fraction of labeled data. In graph-based semi-supervised learning, this is typically achieved through label propagation to predict labels of unlabeled nodes. However, in real-world applications, data often arrive incrementally in batches. Each time a new batch appears, reapplying the traditional label propagation algorithm to recompute all labels is redundant, computationally intensive, and inefficient. To address the absence of an efficient label propagation update method, we propose DynLP, a novel GPU-centric Dynamic Batched Parallel Label Propagation algorithm that performs only the necessary updates, propagating changes to the relevant subgraph without requiring full recalculation. By exploiting GPU architectural optimizations, our algorithm achieves an average speedup of 13× and up to 102× on large-scale datasets compared to state-of-the-art approaches.