CEGA: A Cost-Effective Approach for Graph-Based Model Extraction and Acquisition

📅 2025-06-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of reconstructing graph neural networks (GNNs) under three stringent constraints: a low labeling budget, a prohibition on batched queries, and a scarcity of initial labeled nodes. We propose an iterative node querying framework tailored to non-adversarial research settings. The method jointly uses graph structural features and model response signals, adaptively selecting the most informative nodes based on historical feedback to enable efficient multi-round sampling and model refinement. In contrast to adversarial model extraction attacks, we reframe the security concern as a low-resource learning problem and introduce, for the first time, the concept of GNN knowledge transfer under query-limited conditions. Extensive experiments on multiple benchmark graph datasets show that the approach achieves higher accuracy, model fidelity, and F1 score than the baselines while using significantly fewer queries, balancing scientific utility with security awareness.
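The three reported metrics can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: it assumes fidelity means the fraction of nodes on which the extracted model agrees with the target model (the usual reading in the model extraction literature), and it assumes a macro-averaged F1, which the summary does not specify. The function names are hypothetical.

```python
def accuracy(pred, truth):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def fidelity(surrogate_pred, target_pred):
    """Agreement rate between the extracted (surrogate) model and the
    target model. Assumed definition; ground truth plays no role here."""
    return accuracy(surrogate_pred, target_pred)


def macro_f1(pred, truth):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    classes = sorted(set(truth) | set(pred))
    scores = []
    for c in classes:
        tp = sum(p == c and t == c for p, t in zip(pred, truth))
        fp = sum(p == c and t != c for p, t in zip(pred, truth))
        fn = sum(p != c and t == c for p, t in zip(pred, truth))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Accuracy and F1 compare the extracted model against ground truth, while fidelity compares it against the target model's own outputs, so a replica can score high on fidelity even where the target itself is wrong.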

📝 Abstract

Graph Neural Networks (GNNs) have demonstrated remarkable utility across diverse applications, and their growing complexity has made Machine Learning as a Service (MLaaS) a viable platform for scalable deployment. However, this accessibility also exposes GNNs to serious security threats, most notably model extraction attacks (MEAs), in which adversaries strategically query a deployed model to construct a high-fidelity replica. In this work, we evaluate the vulnerability of GNNs to MEAs and explore their potential for cost-effective model acquisition in non-adversarial research settings. Importantly, adaptive node querying strategies can also serve a critical role in research, particularly when labeling data is expensive or time-consuming. By selectively sampling informative nodes, researchers can train high-performing GNNs with minimal supervision, which is particularly valuable in domains such as biomedicine, where annotations often require expert input. To this end, we propose a node querying strategy tailored to a highly practical yet underexplored scenario in which bulk queries are prohibited and only a limited set of initial nodes is available. Our approach iteratively refines the node selection mechanism over multiple learning cycles, leveraging historical feedback to improve extraction efficiency. Extensive experiments on benchmark graph datasets show that our approach outperforms comparable baselines on accuracy, fidelity, and F1 score under strict query-size constraints. These results highlight both the susceptibility of deployed GNNs to extraction attacks and the promise of ethical, efficient GNN acquisition methods to support low-resource research environments.
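The querying scenario described in the abstract (a small set of initial nodes, a fixed budget, and no bulk queries) can be sketched as a loop that queries one node per cycle and re-scores the remaining candidates after each answer. This is only an illustration under stated assumptions: the scoring rule (a weighted mix of normalized degree and prediction entropy), the toy neighbor-vote surrogate standing in for a GNN, and the names `select_next`, `extract_with_budget`, `query_target`, and `alpha` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def predict_proba(graph, labeled, node, num_classes):
    """Toy surrogate (stand-in for a GNN): class distribution over the
    node's already-labeled neighbors, uniform if none are labeled."""
    counts = Counter(labeled[n] for n in graph[node] if n in labeled)
    total = sum(counts.values())
    if total == 0:
        return [1.0 / num_classes] * num_classes
    return [counts.get(c, 0) / total for c in range(num_classes)]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_next(graph, labeled, num_classes, alpha=0.5):
    """Pick the unlabeled node with the highest combined score.
    Assumed score: alpha * normalized degree + (1 - alpha) * normalized entropy.
    Requires num_classes >= 2."""
    max_deg = max(len(nbrs) for nbrs in graph.values()) or 1
    best, best_score = None, -1.0
    for node in sorted(graph):
        if node in labeled:
            continue
        structure = len(graph[node]) / max_deg
        probs = predict_proba(graph, labeled, node, num_classes)
        uncertainty = entropy(probs) / math.log(num_classes)
        score = alpha * structure + (1 - alpha) * uncertainty
        if score > best_score:
            best, best_score = node, score
    return best


def extract_with_budget(graph, query_target, seed_labels, budget, num_classes):
    """Query the target one node per cycle until the budget is spent.
    Each answer updates `labeled`, so later selections use the feedback."""
    labeled = dict(seed_labels)
    for _ in range(budget):
        node = select_next(graph, labeled, num_classes)
        if node is None:  # every node is already labeled
            break
        labeled[node] = query_target(node)  # exactly one query, no batching
    return labeled
```

Here `graph` is an adjacency dictionary and `query_target` is any callable that returns the target model's label for a single node. Because scores are recomputed after every response, each new label changes which node looks most informative next, which is the role the abstract assigns to historical feedback.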

Problem

Research questions and friction points this paper is trying to address.

Evaluates GNN vulnerability to model extraction attacks (MEAs).

Proposes cost-effective node querying for limited-resource GNN training.

Addresses ethical GNN acquisition in low-resource research settings.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive node querying for efficient GNN training.

Iterative refinement of the node selection mechanism.

Cost-effective model acquisition with limited queries.

🔎 Similar Papers

The Role of Graph Topology in the Performance of Biomedical Knowledge Graph Completion Models