🤖 AI Summary
This study addresses super-spreader identification in multilayer networks. To overcome the poor generalizability and limited interpretability of existing methods, we propose an interpretable graph neural network (GNN) framework built around a four-dimensional spreading-potential vector. The vector records four properties of the diffusion triggered by a single seed node: the number of activations, the duration of the diffusion process, the peak number of activations, and the simulation step at which that peak occurs. To enable systematic evaluation, we construct a dataset by simulating information diffusion across hundreds of multilayer networks. Our architecture combines a relation-agnostic encoder with a custom aggregation layer, which lets the model generalize to unseen networks and adapt to varying graph sizes. Experiments on both real-world and synthetic multilayer networks show that our method outperforms classical centrality measures and competitive deep learning baselines at identifying high-impact nodes, while its structured four-dimensional output improves interpretability.
📝 Abstract
Identifying super-spreaders can be framed as a subtask of the influence maximisation problem: the goal is to pinpoint the agents in a network that, when selected as single diffusion seeds, disseminate information most effectively. Multilayer networks, a specific class of heterogeneous graphs, can capture diverse types of interactions (e.g., physical-virtual or professional-social) and thus offer a more accurate representation of complex relational structures. In this work, we introduce a novel approach to identifying super-spreaders in such networks by leveraging graph neural networks. To this end, we construct a dataset by simulating information diffusion across hundreds of networks; to the best of our knowledge, it is the first of its kind tailored specifically to multilayer networks. We further formulate the task as a variation of the ranking prediction problem based on a four-dimensional vector that quantifies each agent's spreading potential: (i) the number of activations; (ii) the duration of the diffusion process; (iii) the peak number of activations; and (iv) the simulation step at which this peak occurs. Our model, TopSpreadersNetwork, comprises a relationship-agnostic encoder and a custom aggregation layer. This design enables generalisation to previously unseen data and adapts to varying graph sizes. In an extensive evaluation, we compare our model against classic centrality-based heuristics and competitive deep learning methods. The results, obtained across a broad spectrum of real-world and synthetic multilayer networks, demonstrate that TopSpreadersNetwork achieves superior performance in identifying high-impact nodes, while also offering improved interpretability through its structured output.
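To make the four-dimensional spreading-potential vector concrete, the following minimal Python sketch computes it for one seed node. The abstract does not name the diffusion model, so everything beyond the four quantities themselves is an assumption made for illustration: an independent-cascade-style process with activation probability `p`, directed adjacency lists per layer, activations counted without the seed, and steps numbered from 1. The names `spreading_potential` and `example_layers` are hypothetical and do not come from the paper.

```python
import random
from typing import Dict, Hashable, List, Optional, Tuple

# Assumed representation: layer name -> directed adjacency list.
Layers = Dict[str, Dict[Hashable, List[Hashable]]]


def spreading_potential(
    layers: Layers,
    seed: Hashable,
    p: float = 1.0,
    rng: Optional[random.Random] = None,
) -> Tuple[int, int, int, int]:
    """Simulate one diffusion from `seed` and return the 4-d vector:
    (number of activations, duration, peak activations, peak step).

    Assumption: a newly active node gets one chance to activate each
    inactive neighbour, over every layer, with probability `p`.
    """
    rng = rng or random.Random(0)
    active = {seed}
    frontier = {seed}
    new_per_step: List[int] = []  # newly activated nodes at each step

    while frontier:
        newly_active = set()
        for u in frontier:
            for adjacency in layers.values():
                for v in adjacency.get(u, []):
                    if v not in active and v not in newly_active:
                        if rng.random() < p:
                            newly_active.add(v)
        if not newly_active:
            break
        new_per_step.append(len(newly_active))
        active |= newly_active
        frontier = newly_active

    if not new_per_step:
        return (0, 0, 0, 0)  # the seed activated nobody

    total = len(active) - 1                    # (i) activations, seed excluded
    duration = len(new_per_step)               # (ii) steps with new activations
    peak = max(new_per_step)                   # (iii) largest single-step count
    peak_step = new_per_step.index(peak) + 1   # (iv) first step reaching the peak
    return (total, duration, peak, peak_step)


# Toy two-layer network (hypothetical data).
example_layers: Layers = {
    "social": {0: [1, 2], 1: [3]},
    "professional": {0: [4], 4: [5, 6, 7]},
}

print(spreading_potential(example_layers, seed=0))  # (7, 2, 4, 2)
```

With `p=1.0` the process is deterministic: node 0 activates three nodes in step 1 and four more in step 2, which gives the vector (7, 2, 4, 2). A dataset like the one described above would plausibly repeat such simulations for every candidate seed and aggregate the outcomes, but that procedure is not specified in the abstract.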