Meta-learning optimizes predictions of missing links in real-world networks

📅 2025-08-12

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Problem: In real-world networks that lack node attributes, the best link prediction algorithm depends strongly on network topology, yet there is little systematic guidance for choosing one. Method: The paper benchmarks four stacking algorithms built on 42 topological predictors (one of them stacking with a random forest) against two graph neural networks on 550 real-world networks, using AUC and Top-k accuracy. It then frames algorithm selection as a meta-learning task that maps a network's structural characteristics, such as degree distribution, triangle density, and degree assortativity, to the algorithm expected to perform best on that network. Contribution/Results: No single algorithm is best on every network; all algorithms perform well on most social networks, and few perform well on economic and biological networks. Stacking with a random forest is highly scalable, surpasses the graph neural networks on AUC, and is competitive with them on Top-k accuracy. The meta-learning selector outperforms all of the state-of-the-art algorithms compared and scales to large networks.
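The structural characteristics named in the summary can be computed directly from an edge list. The following is a minimal sketch, not the paper's feature pipeline: the function names are hypothetical, "triangle density" is approximated here by global transitivity (closed wedges over all wedges), and degree assortativity is taken as the Pearson correlation of endpoint degrees over edges. All of these choices are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Return an undirected simple graph as a dict of neighbor sets."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # skip self-loops
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def structural_features(edges):
    """Compute a few illustrative topology summaries (assumed feature set)."""
    adj = build_adjacency(edges)
    if not adj:
        raise ValueError("need at least one non-loop edge")
    degrees = {u: len(nbrs) for u, nbrs in adj.items()}
    n = len(degrees)
    mean_degree = sum(degrees.values()) / n
    degree_variance = sum((d - mean_degree) ** 2 for d in degrees.values()) / n

    # Transitivity: fraction of two-paths (wedges) that are closed into triangles.
    wedges = sum(d * (d - 1) // 2 for d in degrees.values())
    closed = sum(
        1
        for nbrs in adj.values()
        for a, b in combinations(nbrs, 2)
        if b in adj[a]
    )
    transitivity = closed / wedges if wedges else 0.0

    # Degree assortativity: Pearson correlation of endpoint degrees,
    # counting every edge once in each direction.
    pairs = [(degrees[u], degrees[v]) for u, nbrs in adj.items() for v in nbrs]
    m = len(pairs)
    mu = sum(x for x, _ in pairs) / m
    var = sum((x - mu) ** 2 for x, _ in pairs) / m
    cov = sum((x - mu) * (y - mu) for x, y in pairs) / m
    assortativity = cov / var if var > 0 else 0.0  # undefined for regular graphs

    return {
        "mean_degree": mean_degree,
        "degree_variance": degree_variance,
        "transitivity": transitivity,
        "assortativity": assortativity,
    }
```

For example, `structural_features([(0, 1), (0, 2), (0, 3)])` describes a star graph, which has no triangles and is maximally disassortative because the hub connects only to degree-1 nodes.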


📝 Abstract

Relational data are ubiquitous in real-world data applications, e.g., in social network analysis or biological modeling, but networks are nearly always incompletely observed. The state of the art for predicting missing links in the hard case of a network without node attributes uses model stacking or neural network techniques. It remains unknown which approach is best, and whether or how the best choice of algorithm depends on the input network's characteristics. We answer these questions systematically using a large, structurally diverse benchmark of 550 real-world networks under two standard accuracy measures (AUC and Top-k), comparing four stacking algorithms with 42 topological link predictors, two of which we introduce here, and two graph neural network algorithms. We show that no algorithm is best across all input networks, all algorithms perform well on most social networks, and few perform well on economic and biological networks. Overall, model stacking with a random forest is highly scalable and either surpasses graph neural networks (on AUC) or is competitive with them (on Top-k accuracy). However, algorithm performance depends strongly on network characteristics like the degree distribution, triangle density, and degree assortativity. We introduce a meta-learning algorithm that exploits this variability to optimize link predictions for individual networks by selecting the best algorithm to apply, which we show outperforms all state-of-the-art algorithms and scales to large networks.
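To make the two accuracy measures concrete, here is a small sketch that scores candidate node pairs with a single topological predictor (common neighbors) and evaluates the ranking. It assumes the usual link prediction definitions: AUC as the probability that a held-out true link outscores a true non-link (ties count half), and Top-k as the fraction of the k highest-scored candidates that are true missing links. The abstract does not spell out its exact definitions, and the helper names and toy graph below are illustrative only.

```python
def common_neighbors(adj, u, v):
    """Score a candidate pair by the number of shared neighbors."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def auc(pos_scores, neg_scores):
    """Probability that a missing link outscores a non-link (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def top_k_precision(scored_pairs, true_missing, k):
    """Fraction of the k highest-scored candidate pairs that are true missing links."""
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:k]
    return sum(1 for pair, _ in ranked if pair in true_missing) / k


# Toy observed graph; the edge (0, 3) is assumed to have been held out.
observed = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2, 4}, 4: {3}}
true_missing = {(0, 3)}
candidates = [(0, 3), (0, 4), (1, 4), (2, 4)]  # all unobserved pairs

scored = [(pair, common_neighbors(observed, *pair)) for pair in candidates]
pos = [s for pair, s in scored if pair in true_missing]
neg = [s for pair, s in scored if pair not in true_missing]

toy_auc = auc(pos, neg)
toy_top1 = top_k_precision(scored, true_missing, k=1)
```

A stacking approach in the spirit of the abstract would feed many such predictor scores into a supervised model instead of ranking by one score alone.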

Problem

Research questions and friction points this paper is trying to address.

Determining the best algorithm for predicting missing links in networks

Assessing how algorithm performance depends on network characteristics

Developing a meta-learning approach to optimize link prediction accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Meta-learning optimizes link prediction algorithm selection

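Model stacking with a random forest is highly scalable, surpasses graph neural networks on AUC, and is competitive with them on Top-k accuracy

Algorithm performance depends strongly on degree distribution, triangle density, and degree assortativity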
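The selection idea in the points above can be sketched as a classifier from network-level features to an algorithm label. The page does not say which meta-learner the paper uses, so the example below substitutes a plain 1-nearest-neighbor rule over standardized features. The feature vectors and the labels `algo_A` and `algo_B` are placeholders, not results from the paper.

```python
import math


def fit_scaler(rows):
    """Per-feature mean and standard deviation (std falls back to 1.0 if zero)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(row, means, stds):
    """Standardize one feature vector with a fitted scaler."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def select_algorithm(meta_train, features):
    """Return the label of the training network nearest in feature space.

    meta_train: list of (feature_vector, best_algorithm_label) pairs, where
    each label is whichever algorithm scored best on that training network.
    """
    means, stds = fit_scaler([row for row, _ in meta_train])
    query = scale(features, means, stds)
    best_label, best_dist = None, math.inf
    for row, label in meta_train:
        dist = math.dist(query, scale(row, means, stds))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Placeholder meta-dataset: [mean degree, transitivity, assortativity] -> label.
meta_train = [
    ([2.0, 0.10, -0.30], "algo_A"),
    ([10.0, 0.60, 0.20], "algo_B"),
    ([2.5, 0.05, -0.20], "algo_A"),
]

choice = select_algorithm(meta_train, [9.0, 0.50, 0.10])
```

In practice the labels would come from running every candidate algorithm on each training network and recording the winner under the chosen metric; the selector then needs only the cheap structural features of a new network.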
