🤖 AI Summary
Brain connectomic networks are subject-specific, high-dimensional, and sparse, and they lack node- or edge-level covariates. To address these challenges, this paper proposes Adaptive Contrastive Edge Representation Learning (ACERL). ACERL uses a data-driven adaptive random masking mechanism to generate augmented network pairs and learns edge-level representations within a contrastive learning framework. The authors derive non-asymptotic error bounds and show that the method achieves the minimax-optimal convergence rate for edge representation learning. They also establish theoretical guarantees for downstream tasks, including network classification, important edge detection, and community detection. Experiments on synthetic data and real brain connectivity studies show competitive performance against the baseline method of sparse principal component analysis (sparse PCA).
📝 Abstract
Network representation learning seeks to embed networks into a low-dimensional space while preserving their structural and semantic properties, thereby facilitating downstream tasks such as classification, trait prediction, edge identification, and community detection. Motivated by challenges in brain connectivity data analysis, which is characterized by subject-specific, high-dimensional, and sparse networks that lack node or edge covariates, we propose a novel contrastive learning-based statistical approach for network edge embedding, which we name Adaptive Contrastive Edge Representation Learning (ACERL). It builds on two key components: contrastive learning of augmented network pairs, and a data-driven adaptive random masking mechanism. We establish non-asymptotic error bounds and show that our method achieves the minimax optimal convergence rate for edge representation learning. We further demonstrate the applicability of the learned representation in multiple downstream tasks, including network classification, important edge detection, and community detection, and establish the corresponding theoretical guarantees. We validate our method on both synthetic data and real brain connectivity studies, and show its competitive performance compared to the baseline method of sparse principal component analysis.
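As a rough illustration of the augmentation idea only, the sketch below splits one vectorized network into two masked views that could serve as a contrastive pair. The function name `masked_pair`, the flat list-of-edge-weights layout, the complementary split between the two views, and the fixed `keep_prob` values are all assumptions made for this example. The abstract does not specify how ACERL chooses its masking probabilities or how its contrastive objective is defined, so this should not be read as the authors' procedure.

```python
import random


def masked_pair(edges, keep_prob, rng):
    """Split one vectorized network into two complementary masked views.

    edges:     list of edge weights for one subject (hypothetical layout)
    keep_prob: list of per-edge probabilities of keeping the edge in view A
    rng:       random.Random instance, passed in for reproducibility
    """
    if len(edges) != len(keep_prob):
        raise ValueError("edges and keep_prob must have the same length")
    view_a, view_b = [], []
    for weight, p in zip(edges, keep_prob):
        keep = rng.random() < p  # one Bernoulli(p) draw per edge
        # Each edge lands in exactly one view; the other view gets a zero.
        view_a.append(weight if keep else 0.0)
        view_b.append(0.0 if keep else weight)
    return view_a, view_b


rng = random.Random(0)
edges = [0.8, 0.0, -0.3, 0.5, 0.0, 1.2]
# Uniform probabilities here (an assumption); an adaptive scheme
# would instead set these per edge from the data.
keep_prob = [0.5] * len(edges)
view_a, view_b = masked_pair(edges, keep_prob, rng)
```

In this sketch the two views never share a nonzero edge and sum back to the original network, so any agreement between their learned representations would have to come from structure shared across edges rather than from copying the same entries.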