Explanation-Preserving Augmentation for Semi-Supervised Graph Representation Learning

πŸ“… 2024-10-16
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 1
✨ Influential: 0
πŸ€– AI Summary
Existing self-supervised graph representation learning methods predominantly rely on data perturbation for augmentation, neglecting semantic consistency and thereby limiting representation quality. To address this, we propose the Explanation-Preserving Augmentation (EPA) frameworkβ€”the first to integrate graph explanation techniques (e.g., GNNExplainer variants) into graph augmentation design. EPA trains a lightweight explainer on a small set of labeled nodes to identify semantically critical substructures, then generates semantically consistent augmented graphs accordingly. Coupled with subgraph sampling, feature masking, and contrastive learning, EPA enables semantic-aware representation learning within semi-supervised GNNs. Theoretical analysis establishes its robustness to structural noise, while extensive experiments demonstrate that EPA consistently outperforms state-of-the-art semantic-agnostic methods across multiple benchmarks, significantly improving both semantic fidelity and generalization performance.
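The augmentation step described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact algorithm: it assumes the trained explainer outputs a per-edge importance score, treats the top-ranked edges as the "explanation" that must survive augmentation, and randomly drops only the remaining edges. All names and parameters (`keep_ratio`, `drop_prob`) are hypothetical.

```python
import random

def epa_style_augment(edges, edge_importance, keep_ratio=0.5, drop_prob=0.3, seed=0):
    """Sketch of an explanation-preserving augmentation (assumed form):
    edges the explainer deems important are always kept; the remaining
    edges are dropped at random to introduce perturbation."""
    rng = random.Random(seed)
    # Rank edge indices by explainer importance; the top fraction is the explanation.
    ranked = sorted(range(len(edges)), key=lambda i: edge_importance[i], reverse=True)
    n_keep = max(1, int(keep_ratio * len(edges)))
    explanation = set(ranked[:n_keep])
    augmented = []
    for i, e in enumerate(edges):
        # Explanation edges are preserved; others survive with prob. 1 - drop_prob.
        if i in explanation or rng.random() > drop_prob:
            augmented.append(e)
    return augmented

# Toy graph with hypothetical explainer scores: the two high-scoring edges
# form the semantic substructure and are never removed.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
importance = [0.9, 0.8, 0.1, 0.2, 0.05]
aug = epa_style_augment(edges, importance, keep_ratio=0.4, drop_prob=0.5)
```

Generating two such augmentations per graph (e.g., with different seeds or with feature masking added) yields the paired views that the contrastive objective consumes.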

πŸ“ Abstract
Graph representation learning (GRL), enhanced by graph augmentation methods, has emerged as an effective technique for improving performance on a wide range of tasks such as node classification and graph classification. In self-supervised GRL, paired graph augmentations are generated from each graph. The objective is to infer similar representations for augmentations of the same graph, but maximally distinguishable representations for augmentations of different graphs. Analogous to the image and language domains, the desiderata of an ideal augmentation method include both (1) semantics-preservation; and (2) data-perturbation; i.e., an augmented graph should preserve the semantics of its original graph while carrying sufficient variance. However, most existing (un-)/self-supervised GRL methods focus on data perturbation but largely neglect semantics preservation. To address this challenge, in this paper, we propose a novel method, Explanation-Preserving Augmentation (EPA), that leverages graph explanation techniques for generating augmented graphs that can bridge the gap between semantics-preservation and data-perturbation. EPA first uses a small number of labels to train a graph explainer to infer the sub-structures (explanations) that are most relevant to a graph's semantics. These explanations are then used to generate semantics-preserving augmentations for self-supervised GRL, namely EPA-GRL. We demonstrate theoretically, using an analytical example, and through extensive experiments on a variety of benchmark datasets that EPA-GRL outperforms the state-of-the-art (SOTA) GRL methods, which are built upon semantics-agnostic data augmentations.
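The contrastive objective stated in the abstract (similar representations for augmentations of the same graph, distinguishable ones across graphs) is commonly instantiated as an NT-Xent loss, as in GraphCL-style methods. Below is a minimal pure-Python sketch of that standard loss; the paper's exact formulation may differ.

```python
import math

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss over paired augmentations.
    z1[i] and z2[i] are embeddings of two augmentations of graph i;
    each view's positive is its counterpart, all other views are negatives."""
    def cos(a, b):
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return sum(x * y for x, y in zip(a, b)) / (na * nb)

    z = z1 + z2                      # 2N embeddings; i and (i+N) % 2N are positives
    n = len(z1)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)
        # Denominator: all views except the anchor itself (positive included).
        denom = sum(math.exp(cos(z[i], z[j]) / temperature)
                    for j in range(2 * n) if j != i)
        num = math.exp(cos(z[i], z[pos]) / temperature)
        total += -math.log(num / denom)
    return total / (2 * n)

# Perfectly aligned pairs give a lower loss than mismatched pairs.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In practice this would run on batched GNN embeddings (e.g., with PyTorch tensors); the plain-list version here only illustrates the structure of the objective.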
Problem

Research questions and friction points this paper is trying to address.

Preserving semantic meaning in graph augmentations for self-supervised learning
Mitigating the suboptimal representations produced by perturbation-only augmentation methods
Leveraging graph explanations to generate semantics-preserving augmentations in a semi-supervised manner
Innovation

Methods, ideas, or system contributions that make the work stand out.

Explanation-preserving augmentation for graph representation learning
Leveraging graph explanations to ensure semantics-preservation
Semi-supervised framework combining explainer training with augmentation