🤖 AI Summary
This work addresses two problems in multi-domain graph pre-training: high computational overhead and redundant contributions from source-domain data. The authors first identify and characterize this data redundancy, then propose MDGMIX, a framework that pairs a boundary-aware subgraph mixing strategy with a dual-granularity discriminative loss (coarse-grained domain discrimination plus fine-grained domain decomposition). During pre-training, the framework constructs challenging cross-domain subgraphs and uses them to disentangle shared patterns from domain-specific ones. During adaptation, a lightweight prompt-weighting mechanism transfers source-domain knowledge efficiently. Experiments show that MDGMIX consistently outperforms strong baselines on few-shot classification tasks while achieving better time and memory efficiency.
📝 Abstract
Multi-domain graph pre-training is a crucial step in building graph foundation models that generalize across domains. However, existing methods predominantly train jointly on all source-domain graphs, which incurs high computational cost. It also remains unclear whether all source-domain graph data contribute equally to effective transfer. We show empirically that multi-domain graph pre-training exhibits significant data redundancy. Based on this finding, we propose MDGMIX, a multi-domain graph pre-training framework that combines boundary-aware subgraph mixing with hierarchical discrimination. MDGMIX selects boundary nodes to construct challenging mixed-domain subgraphs, then applies coarse-grained domain discrimination and fine-grained domain decomposition losses to decouple shared patterns from domain-specific ones. During adaptation, a lightweight prompt-weighting mechanism transfers source-domain knowledge. Extensive experiments demonstrate that MDGMIX consistently outperforms strong baselines on few-shot classification tasks while exhibiting superior time and memory efficiency. The code is available at: https://github.com/zhengziyu77/MDGMIX.
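The abstract does not define how boundary nodes are chosen or how prompts are weighted, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that a "boundary" node is one whose predicted domain distribution has high entropy, and that prompt weighting is a softmax-weighted sum of per-domain prompt vectors. The function names `select_boundary_nodes` and `mix_prompts` are hypothetical and do not come from the MDGMIX repository.

```python
import math


def domain_entropy(probs):
    """Shannon entropy (in nats) of one node's predicted domain distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_boundary_nodes(domain_probs, k):
    """Return the indices of the k nodes with the most ambiguous domain prediction.

    ASSUMPTION: "boundary" is approximated by high prediction entropy;
    the paper may use a different criterion.
    """
    ranked = sorted(
        range(len(domain_probs)),
        key=lambda i: domain_entropy(domain_probs[i]),
        reverse=True,
    )
    return ranked[:k]


def mix_prompts(prompts, scores):
    """Combine per-domain prompt vectors using softmax weights over the scores.

    ASSUMPTION: one prompt vector and one scalar score per source domain.
    """
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(prompts[0])
    return [sum(w * p[d] for w, p in zip(weights, prompts)) for d in range(dim)]


if __name__ == "__main__":
    # Three nodes, two source domains: node 1 is the most ambiguous.
    probs = [[0.99, 0.01], [0.5, 0.5], [0.7, 0.3]]
    print(select_boundary_nodes(probs, k=2))

    # Two domain prompts with equal scores are averaged.
    print(mix_prompts([[1.0, 0.0], [0.0, 1.0]], scores=[0.0, 0.0]))
```

In a real pipeline the domain probabilities would come from a trained domain discriminator and the scores would be learned during adaptation; here both are hard-coded toy values.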