A Systematic Study of Model Extraction Attacks on Graph Foundation Models

📅 2025-11-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper presents the first systematic study of model extraction attacks (MEAs) against graph foundation models (GFMs), addressing critical gaps in prior work: earlier studies targeted small, single-graph models and did not account for GFMs' multimodality, cross-domain generalization, or high pretraining cost. Method: the authors formally define six realistic MEA scenarios tailored to GFMs, covering domain-level and graph-level targets, architectural mismatch, low query budgets, partial node access, and training-data distribution shifts. They propose a lightweight embedding-regression attack that removes any reliance on pretraining data; instead, it uses supervised regression to align the surrogate model's outputs with those of the victim's text encoder, enabling high-fidelity black-box replication while preserving zero-shot inference capability. Results: evaluated across seven benchmark datasets, the method achieves near-identical performance to the original GFM (accuracy loss ≈ 0) under extremely low query budgets, exposing severe extractability vulnerabilities in current GFMs.
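The mechanics of such an attack admit a compact illustration: query the black-box victim for graph embeddings, then fit a surrogate encoder with a plain regression loss so its outputs land in the victim's embedding space. Below is a minimal PyTorch sketch under these assumptions; SurrogateEncoder, query_victim, and feature_loader are hypothetical placeholders, not the paper's implementation.

```python
# Minimal sketch of the embedding-regression extraction idea, assuming
# black-box access to a victim API that returns graph embeddings.
# SurrogateEncoder, query_victim, and feature_loader are hypothetical names.
import torch
import torch.nn as nn

class SurrogateEncoder(nn.Module):
    """Stand-in surrogate; a real attack would use a GNN with message passing."""
    def __init__(self, in_dim: int, hidden_dim: int, embed_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def extract(surrogate: nn.Module, query_victim, feature_loader,
            epochs: int = 10, lr: float = 1e-3) -> nn.Module:
    """Fit the surrogate by regressing onto the victim's embeddings (MSE)."""
    optimizer = torch.optim.Adam(surrogate.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x in feature_loader:
            with torch.no_grad():
                target = query_victim(x)  # each call spends query budget
            pred = surrogate(x)
            loss = loss_fn(pred, target)  # align with the victim's embedding space
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return surrogate
```

Because the loss operates directly on embeddings rather than on contrastive pairs, no pretraining corpus is needed, matching the paper's claim of extraction without contrastive pretraining data.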

📝 Abstract
Graph machine learning has advanced rapidly in tasks such as link prediction, anomaly detection, and node classification. As models scale up, pretrained graph models have become valuable intellectual assets because they encode extensive computation and domain expertise. Building on these advances, Graph Foundation Models (GFMs) mark a major step forward by jointly pretraining graph and text encoders on massive and diverse data. This unifies structural and semantic understanding, enables zero-shot inference, and supports applications such as fraud detection and biomedical analysis. However, the high pretraining cost and broad cross-domain knowledge in GFMs also make them attractive targets for model extraction attacks (MEAs). Prior work has focused only on small graph neural networks trained on a single graph, leaving the security implications for large-scale and multimodal GFMs largely unexplored. This paper presents the first systematic study of MEAs against GFMs. We formalize a black-box threat model and define six practical attack scenarios covering domain-level and graph-specific extraction goals, architectural mismatch, limited query budgets, partial node access, and training data discrepancies. To instantiate these attacks, we introduce a lightweight extraction method that trains an attacker encoder using supervised regression of graph embeddings. Even without contrastive pretraining data, this method learns an encoder that stays aligned with the victim text encoder and preserves its zero-shot inference ability on unseen graphs. Experiments on seven datasets show that the attacker can approximate the victim model using only a tiny fraction of its original training cost, with almost no loss in accuracy. These findings reveal that GFMs greatly expand the MEA surface and highlight the need for deployment-aware security defenses in large-scale graph learning systems.
Problem

Research questions and friction points this paper is trying to address.

Investigates the security vulnerabilities of Graph Foundation Models under model extraction attacks
Exposes how attackers can replicate a victim model at a tiny fraction of its original training cost
Reveals an expanded attack surface in multimodal graph-text learning systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight extraction method that trains an attacker encoder without the victim's pretraining data
Supervised regression of graph embeddings keeps the surrogate aligned with the victim's text encoder
Preserves zero-shot inference on unseen graphs at minimal training cost (see the sketch after this list)
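To see why embedding alignment preserves zero-shot inference, note that GFMs classify by comparing graph embeddings against text embeddings of class descriptions; a surrogate aligned with the victim's text encoder inherits this ability on unseen graphs. A hedged sketch, assuming access to the frozen text-encoding function (encode_text and class_names are hypothetical names):

```python
# Sketch of zero-shot classification with an extracted encoder: once the
# surrogate's outputs live in the victim text encoder's space, prediction
# is cosine similarity against class-description embeddings.
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_classify(surrogate, encode_text, x, class_names):
    g = F.normalize(surrogate(x), dim=-1)                  # graph embeddings
    t = torch.stack([encode_text(name) for name in class_names])
    t = F.normalize(t, dim=-1)                             # class text embeddings
    return (g @ t.T).argmax(dim=-1)                        # nearest class by cosine
```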
👥 Authors
Haoyan Xu
University of Southern California
Machine Learning

Ruizhi Qian
University of Southern California

Jiate Li
University of Southern California

Yushun Dong
Assistant Professor, Department of Computer Science, Florida State University
AI Security, AI Integrity, Graph Machine Learning, LLMs

Minghao Lin
University of Southern California

Hanson Yan
University of Southern California

Zhengtao Yao
University of Southern California

Qinghua Liu
OpenAI
Machine Learning, Reinforcement Learning, Game Theory

Junhao Dong
Nanyang Technological University

Ruopeng Huang
University of Southern California

Yue Zhao
University of Southern California

Mengyuan Li
University of Southern California
Hardware Security, Trusted Execution Environment, Cloud Computing