Improving Graph Out-of-distribution Generalization on Real-world Data

📅 2024-07-14
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing out-of-distribution (OOD) generalization methods for graph-structured data over-rely on strong causal assumptions, neglect the role of environments, and are biased toward synthetic data. Method: We propose DEROG, a framework built on two theoretical results, the Environment-Label Dependency Theorem and the Mutable Rationale Invariance Theorem, which relax the restrictive assumption of strict independence between environments and invariant subgraphs. DEROG uses generalized Bayesian variational inference together with an EM-based algorithm for optimization, and requires no prior knowledge of environments or graph rationales. Contribution/Results: In experiments on real-world graph datasets under different distribution shifts, DEROG outperforms the compared methods. Its implementation is publicly available.

📝 Abstract
Existing methods for graph out-of-distribution (OOD) generalization primarily rely on empirical studies on synthetic datasets. Such approaches tend to overemphasize the causal relationships between invariant sub-graphs and labels, thereby neglecting the non-negligible role of the environment in real-world scenarios. In contrast to previous studies that impose rigid independence assumptions on environments and invariant sub-graphs, this paper presents the theorems of environment-label dependency and mutable rationale invariance: the former characterizes the usefulness of environments in determining graph labels, while the latter describes the mutable importance of graph rationales. Based on this analysis, a novel variational-inference-based method named "Probability Dependency on Environments and Rationales for OOD Graphs on Real-world Data" (DEROG) is introduced. To mitigate the lack of prior knowledge about environments and rationales, DEROG utilizes generalized Bayesian inference. Further, DEROG employs an EM-based algorithm for optimization. Finally, extensive experiments on real-world datasets under different distribution shifts are conducted to show the superiority of DEROG. Our code is publicly available at https://anonymous.4open.science/r/DEROG-536B.
Problem

Research questions and friction points this paper is trying to address.

Addresses the reliance of existing graph OOD generalization methods on synthetic datasets and rigid independence assumptions
Explores environment-label dependency and mutable rationale invariance
Proposes DEROG for real-world data with distribution shifts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses variational inference for OOD graphs
Employs generalized Bayesian inference
Optimizes with EM-based algorithm
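
The three points above can be related through a standard identity. The block below is a generic sketch of how variational inference, an EM-style alternation, and a generalized Bayesian posterior usually fit together; it is not the DEROG objective, which this listing does not state. All symbols are assumptions introduced for illustration: G is an input graph, y its label, z a latent variable (for example an environment or a rationale), θ the model parameters, q a variational distribution, π a prior, ℓ a loss, and λ a positive weight.

```latex
% Generic identity, not the DEROG objective. All symbols are illustrative
% assumptions: G (graph), y (label), z (latent), \theta (parameters),
% q (variational distribution), \pi (prior), \ell (loss), \lambda > 0.
\begin{align}
\log p_\theta(y \mid G)
  &= \underbrace{\mathbb{E}_{q(z)}\big[\log p_\theta(y, z \mid G) - \log q(z)\big]}_{\mathrm{ELBO}(q,\,\theta)}
   + \mathrm{KL}\big(q(z)\,\|\,p_\theta(z \mid y, G)\big) \\
\text{E-step:}\quad q^{(t+1)}
  &= \arg\max_{q}\ \mathrm{ELBO}\big(q,\ \theta^{(t)}\big) \\
\text{M-step:}\quad \theta^{(t+1)}
  &= \arg\max_{\theta}\ \mathrm{ELBO}\big(q^{(t+1)},\ \theta\big) \\
\text{Generalized Bayes:}\quad q_\lambda(z)
  &\propto \pi(z)\,\exp\big(-\lambda\,\ell(z;\ G, y)\big)
\end{align}
```

Because the KL term is non-negative, the ELBO lower-bounds the log-likelihood, and neither step can decrease it. When ℓ is the negative log-likelihood and λ = 1, the last line reduces to the ordinary Bayesian posterior.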
Can Xu
East China Normal University, Shanghai, China
Yao Cheng
East China Normal University, Shanghai, China
Jianxiang Yu
East China Normal University
Data mining, Large language models
Haosen Wang
Southeast University
Jingsong Lv
Zhejiang Lab, Hangzhou, China
Xiang Li
East China Normal University, Shanghai, China