Domain matters: Towards domain-informed evaluation for link prediction

📅 2025-12-29

🤖 AI Summary
Conventional link prediction evaluation often assumes consistent cross-domain algorithmic ranking, ignoring domain-specific generative mechanisms and semantic heterogeneity—leading to erroneous identification of “universally optimal” algorithms. Method: We conduct a systematic evaluation of 12 state-of-the-art algorithms across 740 real-world networks spanning seven domains (social, biological, transportation, etc.), proposing the first “domain-aware evaluation” paradigm and introducing the Winner Score to quantify domain-specific SOTA performance. Results: We find strong inter-domain ranking inconsistency (mean Kendall τ < 0.2) but high intra-domain stability (τ > 0.85); moreover, top-performing algorithms align closely with domain-specific network generation principles (e.g., NMF for social networks, L3-RA for biological networks). We release the first cross-domain benchmark framework for link prediction, enabling principled, domain-adaptive algorithm selection.

📝 Abstract
Link prediction, a foundational task in complex network analysis, has extensive applications in critical scenarios such as social recommendation, drug target discovery, and knowledge graph completion. However, existing evaluations of algorithmic performance often rely on experiments conducted on a limited number of networks and assume that performance rankings are consistent across domains. Despite significant disparities in generative mechanisms and semantic contexts, previous studies often improperly highlight "universally optimal" algorithms based solely on a naive average over networks from different domains. This paper systematically evaluates 12 mainstream link prediction algorithms across 740 real-world networks spanning seven domains. We present substantial empirical evidence on how the algorithms perform in specific domains. These findings reveal a notably low degree of consistency in inter-domain algorithm rankings, in stark contrast to the high degree of consistency observed within individual domains. Principal Component Analysis shows that the response vectors formed by the rankings of the 12 algorithms cluster distinctly by domain in low-dimensional space, confirming that domain attributes are a pivotal factor affecting algorithm performance. We propose a metric called Winner Score that identifies the superior algorithm in each domain: Non-Negative Matrix Factorization for social networks, Neighborhood Overlap-aware Graph Neural Networks for economics, Graph Convolutional Networks for chemistry, and L3-based Resource Allocation for biology. However, these domain-specific top-performing algorithms tend to perform suboptimally in other domains. This finding underscores the importance of aligning an algorithm's mechanism with the network structure.
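
The ranking-consistency comparison described in the abstract can be illustrated with a rank correlation between per-network algorithm rankings. The following is a minimal sketch, assuming Kendall's tau (tau-a, no ties) as the consistency measure. The algorithm labels `A` to `D`, the network names, and all rank values are hypothetical toy data, not results from the paper.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a between two rankings.

    Each ranking is a dict mapping algorithm name -> rank (1 = best).
    Assumes both dicts share the same keys and contain no tied ranks.
    """
    algos = sorted(rank_a)
    concordant = discordant = 0
    for x, y in combinations(algos, 2):
        # Positive product: both rankings order the pair the same way.
        sign = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(algos) * (len(algos) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical rankings of four algorithms on three toy networks.
social_1 = {"A": 1, "B": 2, "C": 3, "D": 4}
social_2 = {"A": 1, "B": 3, "C": 2, "D": 4}
bio_1 = {"A": 4, "B": 3, "C": 2, "D": 1}

intra_domain = kendall_tau(social_1, social_2)  # same domain, similar order
inter_domain = kendall_tau(social_1, bio_1)     # different domain, reversed order
print(round(intra_domain, 3), inter_domain)
```

In this toy setup the two social networks agree on most pairwise orderings, while the biological network reverses them entirely. Averaging such pairwise values within and between domains is one way to obtain the kind of intra-domain and inter-domain consistency figures the summary reports.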
Problem

Research questions and friction points this paper is trying to address.

Evaluates link prediction algorithms across diverse real-world domains.
Reveals low consistency in algorithm rankings between different domains.
Proposes the Winner Score metric to identify the best-performing algorithm in each domain.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluated 12 algorithms across 740 networks spanning seven domains
Proposed Winner Score metric to identify superior domain-specific algorithms
Used Principal Component Analysis to confirm domain attributes affect performance
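
This page does not give the formula for the Winner Score, so the sketch below uses a simple stand-in: the fraction of networks in a domain on which an algorithm achieves the best score. The function name `win_share`, the input layout, and every number are assumptions made for illustration and should not be read as the paper's definition or results.

```python
from collections import Counter, defaultdict


def win_share(results):
    """Per-domain share of networks on which each algorithm scores best.

    `results` is a list of (domain, scores) pairs, where `scores` maps
    algorithm name -> performance on one network (higher is better).
    Ties go to the algorithm that appears first in the dict.
    """
    wins = defaultdict(Counter)
    totals = Counter()
    for domain, scores in results:
        best = max(scores, key=scores.get)
        wins[domain][best] += 1
        totals[domain] += 1
    return {
        domain: {algo: count / totals[domain] for algo, count in wins[domain].items()}
        for domain in totals
    }


# Hypothetical scores for two algorithms on four toy networks.
toy_results = [
    ("social", {"A": 0.90, "B": 0.80}),
    ("social", {"A": 0.85, "B": 0.88}),
    ("social", {"A": 0.70, "B": 0.60}),
    ("biology", {"A": 0.60, "B": 0.75}),
]

shares = win_share(toy_results)
print(shares)
```

Grouping wins by domain rather than averaging scores over all networks keeps a domain with many networks from dominating the comparison, which is the concern the abstract raises about naive cross-domain averages.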