🤖 AI Summary
This paper addresses the limited semantic modeling capability in graph representation learning by proposing Graffe, the first self-supervised framework for graphs based on diffusion probabilistic models (DPMs). Methodologically, Graffe jointly optimizes a graph encoder and a conditional diffusion decoder: the encoder distills a graph into compact node- or graph-level representations, which serve as conditioning signals to guide the denoising process, while the decoder performs conditional denoising via score matching. Theoretically, the authors establish for the first time that this denoising objective implicitly maximizes the conditional mutual information between the data and its representation, and they derive a computable lower bound on it. Empirically, Graffe achieves strong node and graph classification performance under linear-probe evaluation, attaining state-of-the-art results on 9 of 11 real-world benchmarks and systematically demonstrating the effectiveness and promise of diffusion models for graph representation learning.
📝 Abstract
Diffusion probabilistic models (DPMs), widely recognized for their ability to generate high-quality samples, have received comparatively little attention in representation learning. While recent progress has highlighted their potential for capturing visual semantics, adapting DPMs to graph representation learning remains in its infancy. In this paper, we introduce Graffe, a self-supervised diffusion model for graph representation learning. It features a graph encoder that distills a source graph into a compact representation, which in turn serves as the condition guiding the denoising process of the diffusion decoder. To justify this design, we first explore the theoretical foundations of applying diffusion models to representation learning, proving that the denoising objective implicitly maximizes the conditional mutual information between the data and its representation. Specifically, we prove that the negative logarithm of the denoising score matching loss is a tractable lower bound on this conditional mutual information. Empirically, we conduct a series of case studies to validate our theoretical insights. In addition, Graffe delivers competitive results under the linear probing setting on node and graph classification tasks, achieving state-of-the-art performance on 9 of 11 real-world datasets. These findings indicate that powerful generative models, especially diffusion models, are an effective tool for graph representation learning.
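To make the training signal concrete, the encoder-conditioned denoising objective described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: `graph_encoder`, `conditional_denoiser`, and all weight matrices are hypothetical linear stand-ins, and the noise schedule is a toy one; the point is only the shape of the conditional denoising score matching loss, where the encoder's output z conditions the noise predictor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins (names are illustrative, not from the paper's code):
# a graph encoder that pools node features into a compact representation z,
# and a conditional denoiser that predicts the injected noise from (x_t, t, z).
def graph_encoder(node_feats, W_enc):
    # Mean-pool node features, then linearly project to the representation z.
    return node_feats.mean(axis=0) @ W_enc

def conditional_denoiser(x_t, t, z, W_dec):
    # Toy linear denoiser over the noisy input, timestep, and condition z.
    inp = np.concatenate([x_t, [t], z])
    return inp @ W_dec  # predicted noise, same shape as x_t

def dsm_loss(x0, node_feats, alphas_bar, W_enc, W_dec, rng):
    """One Monte Carlo sample of the conditional denoising objective."""
    z = graph_encoder(node_feats, W_enc)            # condition from the encoder
    t = rng.integers(len(alphas_bar))               # random diffusion step
    eps = rng.standard_normal(x0.shape)             # injected Gaussian noise
    a = alphas_bar[t]
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps  # forward diffusion
    eps_hat = conditional_denoiser(x_t, t / len(alphas_bar), z, W_dec)
    return np.mean((eps - eps_hat) ** 2)            # epsilon-prediction MSE

# Tiny example: 5 nodes with 4-dim features, a 3-dim representation z,
# and an 8-dim signal x0 being diffused.
d_feat, d_z, d_x, T = 4, 3, 8, 100
node_feats = rng.standard_normal((5, d_feat))
x0 = rng.standard_normal(d_x)
alphas_bar = np.linspace(0.999, 0.01, T)            # monotone toy noise schedule
W_enc = rng.standard_normal((d_feat, d_z)) * 0.1
W_dec = rng.standard_normal((d_x + 1 + d_z, d_x)) * 0.1
loss = dsm_loss(x0, node_feats, alphas_bar, W_enc, W_dec, rng)
print(f"DSM loss sample: {loss:.4f}")
```

In the actual framework both networks would be learned jointly, so minimizing this loss pushes z to carry the information the decoder needs to denoise, which is the mechanism behind the mutual-information bound stated above.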