🤖 AI Summary
Transfer learning for high-dimensional network data faces dual challenges: severe label scarcity in the target domain and structural dependencies among nodes. Method: This paper proposes Network Convolutional Regression (NCR), a transfer framework that models each node's response as depending on both its own features and the aggregated features of its neighbors. It introduces a two-stage transfer algorithm and an adaptive mechanism for selecting informative source domains. Under an Erdős–Rényi random graph assumption, the paper provides a theoretical proof that transfer learning improves the convergence rate of the lasso estimator. Contribution/Results: By combining cross-network domain shift correction with structure-aware regularization, NCR significantly improves prediction accuracy on both synthetic benchmarks and real-world Sina Weibo data. Its gains are especially pronounced when the target domain contains only a handful of labeled nodes. The framework thus offers both theoretical rigor, establishing convergence-rate benefits of transfer, and practical efficacy, demonstrated through empirical superiority over state-of-the-art baselines.
📝 Abstract
Transfer learning enhances model performance by utilizing knowledge from related domains, particularly when labeled data is scarce. While existing research addresses transfer learning under various distribution shifts in independent settings, handling dependencies in networked data remains challenging. To address this challenge, we propose a high-dimensional transfer learning framework based on network convolutional regression (NCR), inspired by the success of graph convolutional networks (GCNs). The NCR model incorporates random network structure by allowing each node's response to depend on its own features and the aggregated features of its neighbors, capturing local dependencies effectively. Our methodology includes a two-step transfer learning algorithm that corrects for domain shift between source and target networks, along with a source detection mechanism to identify informative domains. Theoretically, we analyze the lasso estimator under an Erdős–Rényi random graph assumption, demonstrating that transfer learning improves convergence rates when informative sources are present. Empirical evaluations, including simulations and a real-world application using Sina Weibo data, demonstrate substantial improvements in prediction accuracy, particularly when labeled data in the target domain is limited.
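The abstract's core modeling idea, that each node's response depends on its own features plus an aggregate of its neighbors' features, can be sketched as follows. This is a minimal, hedged illustration (not the paper's implementation): it assumes the response takes the linear form y = Xβ + ĀXγ + ε, where Ā is the row-normalized adjacency matrix of an Erdős–Rényi graph, and it uses ordinary least squares as a stand-in for the lasso estimator analyzed in the paper. All variable names and dimensions below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5  # nodes, features per node (illustrative sizes)

# Erdős–Rényi random graph: each edge is present independently with prob q.
q = 0.05
upper = np.triu((rng.random((n, n)) < q).astype(float), 1)
A = upper + upper.T  # symmetric adjacency, no self-loops

# Row-normalize so (A_bar @ X)[i] is the mean of node i's neighbors' features.
deg = A.sum(axis=1, keepdims=True)
A_bar = np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)

# Assumed NCR-style data-generating process: own effects beta, neighbor
# effects gamma (both sparse, as in a high-dimensional lasso setting).
X = rng.standard_normal((n, p))
beta = np.array([1.0, -0.5, 0.0, 0.0, 2.0])
gamma = np.array([0.5, 0.0, 1.0, 0.0, 0.0])
y = X @ beta + (A_bar @ X) @ gamma + 0.1 * rng.standard_normal(n)

# Stacking [X, A_bar X] reduces the network model to an ordinary linear
# regression, to which lasso-type estimators apply directly.
Z = np.hstack([X, A_bar @ X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)  # OLS stand-in for the lasso
print(np.round(coef, 2))
```

The key design point this illustrates is that the network convolution (`A_bar @ X`) turns local dependence into extra regressor columns, so standard high-dimensional theory for the lasso can be brought to bear, which is what makes the paper's convergence-rate analysis on random graphs tractable.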