🤖 AI Summary
This work addresses the high computational cost of graph neural networks (GNNs) and the limited expressiveness of multilayer perceptrons (MLPs) in link prediction. It examines how the choice of teacher affects GNN-to-MLP distillation and finds that lightweight heuristic methods, such as common neighbors, can teach graph-independent MLPs to near-GNN performance at a much lower training cost. The proposed Ensemble Heuristic-Distilled MLPs (EHDM) integrate complementary heuristic signals via a learnable gating mechanism. Experiments on ten datasets show an average improvement of 7.93% over previous GNN-to-MLP distillation approaches, with 1.95–3.32× less training time. EHDM thus narrows the trade-off between accuracy and efficiency in link prediction, delivering strong predictive performance and scalable inference without relying on graph topology at deployment.
📝 Abstract
Link prediction is a crucial graph-learning task with applications including citation prediction and product recommendation. Distilling Graph Neural Network (GNN) teachers into Multi-Layer Perceptron (MLP) students has emerged as an effective way to achieve strong performance while reducing computational cost by removing graph dependency. However, existing distillation methods use only standard GNNs and overlook alternative teachers such as specialized models for link prediction (GNN4LP) and heuristic methods (e.g., common neighbors). This paper first explores the impact of different teachers in GNN-to-MLP distillation. Surprisingly, we find that stronger teachers do not always produce stronger students: MLPs distilled from GNN4LP can underperform those distilled from simpler GNNs, while weaker heuristic methods can teach MLPs to near-GNN performance with drastically reduced training costs. Building on these insights, we propose Ensemble Heuristic-Distilled MLPs (EHDM), which eliminates graph dependencies while effectively integrating complementary signals via a gating mechanism. Experiments on ten datasets show an average 7.93% improvement over previous GNN-to-MLP approaches with 1.95–3.32× less training time, indicating that EHDM is an efficient and effective link prediction method.
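To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (1) the common-neighbors heuristic, which scores a candidate link by counting shared neighbors and could serve as a cheap teacher signal, and (2) one possible gating scheme that mixes the outputs of several heuristic-distilled students with softmax weights. The abstract does not specify the gate's architecture, its inputs, or the distillation loss, so the function names, the toy graph, and the per-pair softmax gate are all assumptions made for illustration. Training the student MLPs is omitted.

```python
import numpy as np

def common_neighbors(adj, pairs):
    """Count shared neighbors for each (u, v) pair.

    adj is a dense, symmetric 0/1 adjacency matrix (an assumption made
    for brevity; a real graph would use a sparse representation).
    """
    return np.array([int(adj[u] @ adj[v]) for u, v in pairs])

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gated_ensemble(student_scores, gate_logits):
    """Mix student predictions with per-pair softmax gate weights.

    student_scores: (n_pairs, n_students) link scores, one column per
                    student distilled from a different heuristic.
    gate_logits:    (n_pairs, n_students) unnormalized gate outputs.
                    How these are produced is hypothetical here; the
                    abstract only says a gating mechanism is used.
    """
    weights = softmax(gate_logits, axis=1)
    return (weights * student_scores).sum(axis=1)

# Toy undirected graph with 5 nodes (hypothetical data).
edges = [(0, 2), (0, 3), (1, 2), (1, 3), (3, 4)]
adj = np.zeros((5, 5), dtype=int)
for u, v in edges:
    adj[u, v] = adj[v, u] = 1

# Teacher scores for three candidate links.
cn = common_neighbors(adj, [(0, 1), (0, 4), (2, 4)])

# Two students scoring one candidate link, mixed with equal gate logits.
mixed = gated_ensemble(np.array([[0.2, 0.8]]), np.zeros((1, 2)))
```

In this toy graph, nodes 0 and 1 share neighbors 2 and 3, so the heuristic scores that pair highest. With equal gate logits the ensemble reduces to a plain average of the students; a trained gate would instead shift weight toward whichever student is more reliable for a given pair.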