Subgraph Gaussian Embedding Contrast for Self-Supervised Graph Representation Learning

📅 2025-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address distribution mismatch among subgraph contrastive pairs and insufficient structural fidelity in self-supervised graph representation learning (SSL-GRL) for high-dimensional graph data, this paper proposes a contrastive learning framework based on subgraph Gaussian embedding. The method explicitly models semantic and structural uncertainty of subgraphs via a novel subgraph Gaussian embedding module—the first of its kind. It jointly leverages the Wasserstein distance and Gromov–Wasserstein distance to measure subgraph similarity, thereby aligning both node feature distributions and topological structures, enhancing robustness and distribution controllability of contrastive learning. Extensive experiments on multiple benchmark graph datasets demonstrate that the proposed approach significantly outperforms or matches state-of-the-art methods, empirically validating the critical role of subgraph-level distribution modeling in SSL-GRL performance.
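
For reference, the two optimal transport distances named in the summary have the following standard discrete forms. The notation is an assumption made here for illustration and is not taken from the paper: \mu and \nu are the node weights of two subgraphs with n and m nodes, x_i and y_j are node embeddings, d is a ground distance between embeddings, and C^X, C^Y are intra-subgraph structure matrices (for example adjacency or shortest-path matrices). The paper may use a different cost or a regularized variant.

```latex
\[
\Pi(\mu,\nu) = \bigl\{ \pi \in \mathbb{R}_{\ge 0}^{n \times m} :
  \pi \mathbf{1}_m = \mu,\ \pi^{\top} \mathbf{1}_n = \nu \bigr\}
\]
\[
W_p(\mu,\nu) = \Bigl( \min_{\pi \in \Pi(\mu,\nu)}
  \sum_{i=1}^{n} \sum_{j=1}^{m} \pi_{ij}\, d(x_i, y_j)^p \Bigr)^{1/p}
\]
\[
GW_p(\mu,\nu) = \Bigl( \min_{\pi \in \Pi(\mu,\nu)}
  \sum_{i,k=1}^{n} \sum_{j,l=1}^{m}
  \bigl| C^X_{ik} - C^Y_{jl} \bigr|^p \, \pi_{ij}\, \pi_{kl} \Bigr)^{1/p}
\]
```

The Wasserstein term compares node features across the two subgraphs directly, while the Gromov-Wasserstein term compares only pairwise relations inside each subgraph, which is how the pair can account for both feature distributions and topology.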

📝 Abstract

Graph Representation Learning (GRL) is a fundamental task in machine learning, aiming to encode high-dimensional graph-structured data into low-dimensional vectors. Self-Supervised Learning (SSL) methods are widely used in GRL because they can avoid expensive human annotation. In this work, we propose a novel Subgraph Gaussian Embedding Contrast (SubGEC) method. Our approach introduces a subgraph Gaussian embedding module, which adaptively maps subgraphs to a structured Gaussian space, ensuring the preservation of input subgraph characteristics while generating subgraphs with a controlled distribution. We then employ optimal transport distances, more precisely the Wasserstein and Gromov-Wasserstein distances, to effectively measure the similarity between subgraphs, enhancing the robustness of the contrastive learning process. Extensive experiments across multiple benchmarks demonstrate that SubGEC outperforms or performs competitively with state-of-the-art approaches. Our findings provide insights into the design of SSL methods for GRL, emphasizing the importance of the distribution of the generated contrastive pairs.
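
To make the Wasserstein part of the abstract concrete, here is a minimal sketch of the 2-Wasserstein distance between two small sets of node embeddings. It is not the paper's implementation. It assumes equal-size subgraphs with uniform node weights, in which case an optimal transport plan can be taken to be a one-to-one matching, and it finds that matching by brute force, which is only practical for a handful of nodes. The function name `wasserstein2_uniform` and the embeddings `sub_a`, `sub_b`, `sub_c` are hypothetical.

```python
from itertools import permutations
from math import sqrt


def sq_dist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def wasserstein2_uniform(xs, ys):
    """Exact 2-Wasserstein distance between two equal-size point sets.

    Assumption: every node carries weight 1/n, so the optimum is attained
    by a permutation and can be found by trying all n! matchings.
    """
    n = len(xs)
    if n == 0 or len(ys) != n:
        raise ValueError("point sets must be non-empty and of equal size")
    best = min(
        sum(sq_dist(xs[i], ys[p[i]]) for i in range(n))
        for p in permutations(range(n))
    )
    return sqrt(best / n)


# Hypothetical 2-D node embeddings of three small subgraphs.
sub_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sub_b = [(0.0, 1.0), (0.0, 0.0), (1.0, 0.0)]  # same points, different node order
sub_c = [(3.0, 4.0), (4.0, 4.0), (3.0, 5.0)]  # sub_a shifted by (3, 4)

same = wasserstein2_uniform(sub_a, sub_b)     # node order does not matter
shifted = wasserstein2_uniform(sub_a, sub_c)  # equals the length of the shift
```

Because the distance is defined over the whole set of embeddings, reordering nodes leaves it unchanged, while a uniform shift of every embedding changes it by exactly the shift length. For realistic subgraph sizes, a linear-programming or entropic-regularized solver would replace the brute-force search.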

Problem

Research questions and friction points this paper is trying to address.

Encode graph data into low-dimensional vectors without human annotation

Measure subgraph similarity using optimal transport distances

Improve robustness in self-supervised graph representation learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Subgraph Gaussian embedding for structured representation

Optimal transport distances measure subgraph similarity

Controlled distribution enhances contrastive learning robustness
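
The structural side of the optimal transport comparison can be sketched in the same spirit. The snippet below scores how well the pairwise structure of one small subgraph can be aligned with another, using a Gromov-Wasserstein-style squared loss. It is an illustration under loud assumptions, not the paper's method: both subgraphs have the same number of nodes, weights are uniform, adjacency matrices serve as structure matrices, and the search is restricted to one-to-one node matchings, so the result is only an upper bound on the true Gromov-Wasserstein objective. The function name `gw_discrepancy_permutations` and the example graphs are hypothetical.

```python
from itertools import permutations


def gw_discrepancy_permutations(c1, c2):
    """Best squared structural mismatch over all one-to-one node matchings.

    c1, c2: square structure matrices (here: adjacency) of equal size n.
    Returns min over permutations p of
        (1 / n^2) * sum_{i,k} (c1[i][k] - c2[p[i]][p[k]]) ** 2
    Restricting to permutations makes this an upper bound on the
    Gromov-Wasserstein objective with uniform weights.
    """
    n = len(c1)
    if n == 0 or len(c2) != n:
        raise ValueError("structure matrices must be non-empty and equal-size")
    best = min(
        sum(
            (c1[i][k] - c2[p[i]][p[k]]) ** 2
            for i in range(n)
            for k in range(n)
        )
        for p in permutations(range(n))
    )
    return best / (n * n)


# Hypothetical 3-node subgraphs given as adjacency matrices.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]             # path with centre node 1
path_relabelled = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]  # same path, centre node 2
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

iso = gw_discrepancy_permutations(path, path_relabelled)  # isomorphic graphs
diff = gw_discrepancy_permutations(path, triangle)        # one edge differs
```

Only relations within each subgraph enter the score, so two isomorphic subgraphs match perfectly regardless of node labelling, while the path and the triangle differ by the single missing edge.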
