🤖 AI Summary
This work presents the first systematic investigation of privacy risks—specifically membership inference attacks (MIAs)—against multi-domain graph pre-trained models (MD-GPTs). We address three key challenges: MD-GPTs’ strong generalization capability, the non-representativeness of shadow data, and the weak membership signal inherent in pre-trained representations. To overcome these, we propose MGP-MIA, a novel MIA framework comprising: (1) machine unlearning–enhanced overfitting to amplify membership signals; (2) high-fidelity shadow model construction via incremental learning; and (3) fine-grained membership discrimination based on embedding similarity. Extensive experiments across multiple MD-GPT architectures demonstrate that MGP-MIA significantly outperforms existing MIAs in attack success rate. Our findings expose severe privacy leakage vulnerabilities in current multi-domain graph pre-training paradigms. Moreover, MGP-MIA establishes a new methodology and empirical benchmark for privacy evaluation of graph foundation models.
📝 Abstract
Multi-domain graph pre-training has emerged as a pivotal technique for developing graph foundation models. While it greatly improves the generalization of graph neural networks, its privacy risks under membership inference attacks (MIAs), which aim to identify whether a specific instance was used in training (i.e., is a member), remain largely unexplored. Effectively conducting MIAs against multi-domain graph pre-trained models poses significant challenges due to: (i) Enhanced Generalization Capability: multi-domain pre-training reduces the overfitting characteristics commonly exploited by MIAs. (ii) Unrepresentative Shadow Datasets: diverse training graphs hinder the collection of reliable shadow graphs. (iii) Weakened Membership Signals: embedding-based outputs offer less informative cues for MIAs than logits. To tackle these challenges, we propose MGP-MIA, a novel framework for Membership Inference Attacks against Multi-domain Graph Pre-trained models. Specifically, we first propose a membership signal amplification mechanism that amplifies the overfitting characteristics of target models via machine unlearning. We then design an incremental shadow model construction mechanism that builds a reliable shadow model from limited shadow graphs via incremental learning. Finally, we introduce a similarity-based inference mechanism that identifies members based on their similarity to positive and negative samples. Extensive experiments demonstrate the effectiveness of MGP-MIA and reveal the privacy risks of multi-domain graph pre-training.
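To make the similarity-based inference mechanism concrete, here is a minimal illustrative sketch: an instance's embedding is compared against reference embeddings of known members (positive samples) and known non-members (negative samples), and classified as a member when it is, on average, closer to the positive set. This is only an assumption-level toy with synthetic Gaussian clusters standing in for shadow-model embeddings, not the paper's actual implementation; the function names and the simple mean-cosine-similarity decision rule are illustrative choices.

```python
import numpy as np

def cosine_sims(query, refs):
    # cosine similarity between one vector and each row of a reference matrix
    q = query / np.linalg.norm(query)
    r = refs / np.linalg.norm(refs, axis=1, keepdims=True)
    return r @ q

def infer_membership(query_emb, member_embs, nonmember_embs):
    """Toy decision rule: predict 'member' if the query embedding is,
    on average, closer to known member embeddings than to non-member ones."""
    sim_pos = cosine_sims(query_emb, member_embs).mean()
    sim_neg = cosine_sims(query_emb, nonmember_embs).mean()
    return sim_pos > sim_neg

# synthetic demo: two well-separated clusters mimic member / non-member embeddings
rng = np.random.default_rng(0)
members = rng.normal(loc=1.0, scale=0.1, size=(50, 16))
nonmembers = rng.normal(loc=-1.0, scale=0.1, size=(50, 16))
query = rng.normal(loc=1.0, scale=0.1, size=16)  # drawn from the member cluster
print(infer_membership(query, members, nonmembers))
```

In the actual attack setting, the positive and negative reference embeddings would come from the incrementally constructed shadow model, and the unlearning-based amplification step would widen the gap the decision rule relies on.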