🤖 AI Summary
This study addresses the challenge of inaccurate estimation of connection probability matrices in sparse target networks by introducing transfer learning into the degree-corrected mixed membership (DCMM) model for the first time. The proposed framework leverages knowledge from information-rich source networks to enhance estimation accuracy, while employing random projections to reduce computational complexity. An iterative truncation algorithm is designed to selectively incorporate beneficial source data, thereby mitigating negative transfer. Theoretical analysis demonstrates that the transfer gain arises from an enlarged eigenvalue gap in the target connection probability matrix. Extensive experiments on journal citation and international trade networks confirm the framework’s superior performance in estimation accuracy, computational efficiency, and robustness.
📝 Abstract
Statistical analysis of network data has attracted considerable attention in recent years, owing to the rapid advancement of well-trained network models and the accessibility of large public network datasets. In this article, we propose a transfer learning procedure for boosting the estimation accuracy of a target network structure based on the well-known Degree-Corrected Mixed-Membership (DCMM) model. By leveraging useful information from informative source datasets, we theoretically prove that the transfer learning procedure greatly improves the estimation accuracy for the target connection probability matrix. Our theoretical analysis also reveals that the benefits of knowledge transfer in this context are attributable to the enlarged eigenvalue gap of the target connection probability matrix. Additionally, we propose a random projection step in conjunction with the conventional aggregation procedure to alleviate the heavy computational burden in practice. In the presence of potentially harmful sources, we further provide an iterative truncation algorithm for selecting useful datasets and avoiding negative transfer. Numerical results showcase the practical utility of our methods in real-world network data analysis, including a journal citation network and an international trade network.
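To make the aggregation idea concrete, the following is a minimal simulation sketch, not the procedure proposed in the paper. It assumes the standard DCMM form P = Θ Π B Πᵀ Θ for the connection probability matrix, and it assumes (optimistically) that every source network is drawn from exactly the same P as the target. The estimator is plain averaging of adjacency matrices followed by a rank-K eigen-truncation; the random projection step and the iterative truncation algorithm are not included. All function names, symbols, and parameter values are illustrative choices.

```python
import numpy as np

def dcmm_probability(theta, Pi, B):
    """Connection probabilities under the assumed DCMM form P = Theta Pi B Pi^T Theta."""
    weighted = theta[:, None] * Pi          # row i is theta_i * pi_i
    return weighted @ B @ weighted.T

def sample_adjacency(P, rng):
    """Draw a symmetric 0/1 adjacency matrix with independent edges and no self-loops."""
    n = P.shape[0]
    upper = np.triu((rng.random((n, n)) < P).astype(float), k=1)
    return upper + upper.T

def rank_k_estimate(A, K):
    """Keep the K eigenpairs of A with the largest absolute eigenvalues."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[-K:]
    return (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T

def relative_error(P_hat, P):
    """Relative Frobenius-norm error of an estimate of P."""
    return np.linalg.norm(P_hat - P) / np.linalg.norm(P)

rng = np.random.default_rng(0)
n, K, n_sources = 300, 3, 10                    # illustrative sizes

Pi = rng.dirichlet(0.5 * np.ones(K), size=n)    # mixed memberships, rows sum to 1
theta = rng.uniform(0.3, 0.9, size=n)           # degree heterogeneity parameters
B = 0.1 * np.ones((K, K)) + 0.5 * np.eye(K)     # community connectivity matrix
P = dcmm_probability(theta, Pi, B)

# Assumption: target and all sources share the same P (fully informative sources).
A_target = sample_adjacency(P, rng)
A_sources = [sample_adjacency(P, rng) for _ in range(n_sources)]

# Target-only estimate versus estimate from the averaged adjacency matrix.
A_pooled = (A_target + sum(A_sources)) / (1 + n_sources)
err_target = relative_error(rank_k_estimate(A_target, K), P)
err_pooled = relative_error(rank_k_estimate(A_pooled, K), P)

print(f"target-only relative error: {err_target:.3f}")
print(f"pooled relative error:      {err_pooled:.3f}")
```

In this idealized setting, averaging reduces the noise in the observed adjacency matrix, so the leading K eigenvalues are easier to separate from the noise. When sources differ from the target, naive averaging can hurt, which is the negative-transfer situation the abstract's selection step is meant to handle.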