Transfer Learning with Network Embeddings under Structured Missingness

📅 2026-02-23
🤖 AI Summary
Transferring knowledge across sites is often hindered by inconsistent features and heterogeneous populations. This work proposes TransNEST, a framework that combines network embeddings with explicit handling of structured missingness. By jointly using graph data and prior group structure from the source and target sites, TransNEST adaptively models within-group heterogeneity and between-site differences. The framework further refines the embeddings through group-wise regularization and a hierarchical ontology structure. Under partial feature overlap and small samples, it performs strongly in simulations. In a multi-site electronic health record study that transfers embeddings from a general hospital system to a pediatric one, TransNEST achieves the best accuracy among benchmark methods for identifying pediatric-specific relational feature pairs.

📝 Abstract
Modern data-driven applications increasingly rely on large, heterogeneous datasets collected across multiple sites. Differences in data availability, feature representation, and underlying populations often induce structured missingness, complicating efforts to transfer information from data-rich settings to those with limited data. Many transfer learning methods overlook this structure, limiting their ability to capture meaningful relationships across sites. We propose TransNEST (Transfer learning with Network Embeddings under STructured missingness), a framework that integrates graphical data from source and target sites with prior group structure to construct and refine network embeddings. TransNEST accommodates site-specific features, captures within-group heterogeneity and between-site differences adaptively, and improves embedding estimation under partial feature overlap. We establish the convergence rate for the TransNEST estimator and demonstrate strong finite-sample performance in simulations. We apply TransNEST to a multi-site electronic health record study, transferring feature embeddings from a general hospital system to a pediatric hospital system. Using a hierarchical ontology structure, TransNEST improves pediatric embeddings and supports more accurate pediatric knowledge extraction, achieving the best accuracy for identifying pediatric-specific relational feature pairs compared with benchmark methods.
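
As a rough illustration of the structured missingness described above, the minimal Python sketch below builds the block-wise observation pattern that arises when two sites share only part of their features, then fits low-rank embeddings using the observed entries alone. It is not the TransNEST estimator: the masked factorization, the helper names (`block_missing_mask`, `fit_masked_embeddings`, `toy_problem`), the toy dimensions, and the step size are all assumptions made for illustration, and the group structure and regularization from the paper are omitted.

```python
import numpy as np


def block_missing_mask(n_source_only, n_overlap, n_target_only):
    """Observed-entry mask for a combined feature-by-feature matrix.

    Assumed feature order: [source-only | overlap | target-only].
    Pairs of (source-only, target-only) features are never observed together.
    """
    n = n_source_only + n_overlap + n_target_only
    mask = np.zeros((n, n), dtype=bool)
    source_end = n_source_only + n_overlap
    mask[:source_end, :source_end] = True          # pairs seen at the source site
    mask[n_source_only:, n_source_only:] = True    # pairs seen at the target site
    return mask


def fit_masked_embeddings(M, mask, dim, n_iter=3000, lr=0.005, seed=0):
    """Fit X so that X @ X.T matches M on observed entries only.

    Plain gradient descent on the squared error over observed entries.
    Assumes M and mask are symmetric.
    """
    rng = np.random.default_rng(seed)
    X = 0.1 * rng.standard_normal((M.shape[0], dim))
    losses = []
    for _ in range(n_iter):
        R = np.where(mask, X @ X.T - M, 0.0)   # residual, zeroed where unobserved
        losses.append(float(np.sum(R ** 2)))   # loss before this update
        X -= lr * 4.0 * (R @ X)                # gradient of sum(R**2) for symmetric R
    return X, losses


def toy_problem(seed=1):
    """Hypothetical rank-2 similarity matrix over 9 features (3 + 3 + 3)."""
    rng = np.random.default_rng(seed)
    truth = rng.standard_normal((9, 2))
    truth /= np.linalg.norm(truth, axis=1, keepdims=True)  # unit-norm rows
    return truth @ truth.T, block_missing_mask(3, 3, 3)
```

Calling `fit_masked_embeddings(*toy_problem(), dim=2)` returns the fitted embeddings together with the loss trajectory. The source-only by target-only block never enters the loss, so any similarity the fitted embeddings imply for those pairs comes entirely from the shared overlap features.
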
Problem

Research questions and friction points this paper is trying to address.

transfer learning
structured missingness
network embeddings
multi-site data
feature heterogeneity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transfer Learning
Network Embeddings
Structured Missingness
Multi-site Data
Hierarchical Ontology
Authors
Mengyan Li
Department of Mathematical Sciences, Bentley University, Waltham, USA
Xiaoou Li
University of Minnesota
Latent Variable Models, Sequential Methods, Psychometrics
Kenneth D Mandl
Harvard Medical School, Boston, USA
Tianxi Cai
Harvard University
statistics, biostatistics, modeling, prediction, genomics