🤖 AI Summary
This work addresses the challenge of reconstructing graph neural networks (GNNs) under stringent constraints: a low labeling budget, a prohibition on batched queries, and few initial labeled nodes. We propose an iterative node querying framework tailored to non-adversarial research settings. The method combines graph structural features with the target model's response signals, adaptively selecting the most informative nodes based on historical feedback to enable efficient multi-round sampling and model refinement. In contrast to adversarial model extraction attacks, we reframe the security concern as a low-resource learning problem and introduce, for the first time, the concept of GNN knowledge transfer under query-limited conditions. Extensive experiments on multiple benchmark graph datasets show that our approach achieves superior accuracy, model fidelity, and F1 score with significantly fewer queries, balancing scientific utility with security awareness.
📝 Abstract
Graph Neural Networks (GNNs) have demonstrated remarkable utility across diverse applications, and their growing complexity has made Machine Learning as a Service (MLaaS) a viable platform for scalable deployment. However, this accessibility also exposes GNNs to serious security threats, most notably model extraction attacks (MEAs), in which adversaries strategically query a deployed model to construct a high-fidelity replica. In this work, we evaluate the vulnerability of GNNs to MEAs and explore their potential for cost-effective model acquisition in non-adversarial research settings. Adaptive node querying strategies can also play a critical role in research when labeling data is expensive or time-consuming. By selectively sampling informative nodes, researchers can train high-performing GNNs with minimal supervision, which is particularly valuable in domains such as biomedicine, where annotations often require expert input. To this end, we propose a node querying strategy tailored to a highly practical yet underexplored scenario in which bulk queries are prohibited and only a limited set of initial nodes is available. Our approach iteratively refines the node selection mechanism over multiple learning cycles, leveraging historical feedback to improve extraction efficiency. Extensive experiments on benchmark graph datasets show that our approach outperforms comparable baselines in accuracy, fidelity, and F1 score under strict query-size constraints. These results highlight both the susceptibility of deployed GNNs to extraction attacks and the promise of ethical, efficient GNN acquisition methods to support low-resource research environments.
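The abstract does not specify the selection mechanism, so the following is only a minimal sketch of what a one-query-per-round loop of this kind might look like, not the authors' method. It assumes a label-propagation surrogate in place of a trained GNN and a selection score that mixes surrogate prediction entropy with node degree. The function names, the `alpha` weight, and the `oracle` callable standing in for the deployed model are all hypothetical.

```python
import numpy as np

def propagate_labels(adj, labels, num_classes, steps=10):
    """Stand-in surrogate (assumption): spread known labels over the graph."""
    n = adj.shape[0]
    adj_hat = adj + np.eye(n)  # self-loops keep every row sum positive
    trans = adj_hat / adj_hat.sum(axis=1, keepdims=True)
    probs = np.full((n, num_classes), 1.0 / num_classes)
    for _ in range(steps):
        probs = trans @ probs
        for node, lab in labels.items():  # clamp nodes whose label is known
            probs[node] = 0.0
            probs[node, lab] = 1.0
    return probs

def select_next_node(adj, probs, labeled, alpha=0.5):
    """Hypothetical score: normalized entropy plus alpha * normalized degree."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    degree = adj.sum(axis=1)
    score = entropy / np.log(probs.shape[1]) + alpha * degree / max(degree.max(), 1.0)
    score[list(labeled)] = -np.inf  # never re-query a labeled node
    return int(np.argmax(score))

def iterative_extraction(adj, oracle, seed_labels, num_classes, budget):
    """Query the target one node per round, refitting the surrogate each time."""
    labels = dict(seed_labels)
    rounds = min(budget, adj.shape[0] - len(labels))
    for _ in range(rounds):
        probs = propagate_labels(adj, labels, num_classes)  # uses all feedback so far
        node = select_next_node(adj, probs, labels)
        labels[node] = oracle(node)  # single query, no batching
    predictions = propagate_labels(adj, labels, num_classes).argmax(axis=1)
    return predictions, labels
```

Here `adj` is a dense symmetric NumPy adjacency matrix and `seed_labels` maps the few initially labeled nodes to their classes. Fidelity could then be estimated as the fraction of nodes on which `predictions` agrees with the target model's outputs.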