🤖 AI Summary
This paper identifies a temporal granularity inconsistency in the prevailing batch-wise evaluation paradigm for dynamic link prediction. The inconsistency leads to misaligned time windows and spurious temporal dependencies, causing up to 12.7% AUC estimation bias and undermining model comparability and generalizability. To address this, the authors first systematically characterize this distortion mechanism and then propose a time-aware evaluation framework centered on: (i) timestamp-aligned sequential modeling, (ii) dynamic graph neural network adaptation, (iii) temporal sliding-window resampling, and (iv) a counterfactual evaluation protocol. Experiments across multiple benchmark datasets demonstrate that the proposed paradigm substantially mitigates evaluation bias, enables fairer model comparison, and improves cross-scenario generalization. The work provides both theoretical foundations and a practical framework for standardizing evaluation in temporal graph learning.
📝 Abstract
Dynamic link prediction is an important problem addressed by many recent works that propose approaches for learning temporal edge patterns. To assess their efficacy, these models are evaluated on benchmark datasets involving continuous-time and discrete-time temporal graphs. However, as we show in this work, the suitability of the common batch-oriented evaluation depends on the datasets' characteristics, which can cause several issues. For continuous-time temporal graphs, fixed-size batches create time windows of different durations, resulting in an inconsistent dynamic link prediction task. For discrete-time temporal graphs, the sequence of batches can additionally introduce temporal dependencies that are not present in the data. We show empirically that this common evaluation approach leads to skewed model performance and hinders the fair comparison of methods. We mitigate this problem by reformulating dynamic link prediction as a link forecasting task that better accounts for the temporal information present in the data.
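The two batching issues described in the abstract can be illustrated with a minimal Python sketch. The helper names (`fixed_size_batches`, `time_window_batches`, `batch_durations`, `split_snapshots`) and the toy timestamps are assumptions made for illustration; they are not taken from the paper and do not reproduce its evaluation protocol.

```python
from typing import Dict, List, Set


def fixed_size_batches(timestamps: List[int], batch_size: int) -> List[List[int]]:
    """Split a time-sorted edge list into batches with a fixed number of edges."""
    return [timestamps[i:i + batch_size] for i in range(0, len(timestamps), batch_size)]


def time_window_batches(timestamps: List[int], window: int) -> List[List[int]]:
    """Split a time-sorted edge list into windows of a fixed duration."""
    if not timestamps:
        return []
    start = timestamps[0]
    num_windows = (timestamps[-1] - start) // window + 1
    batches: List[List[int]] = [[] for _ in range(num_windows)]
    for t in timestamps:
        batches[(t - start) // window].append(t)
    return batches


def batch_durations(batches: List[List[int]]) -> List[int]:
    """Time span covered by each non-empty batch (last minus first timestamp)."""
    return [b[-1] - b[0] for b in batches if b]


def split_snapshots(batches: List[List[int]]) -> Set[int]:
    """Timestamps whose edges are spread over more than one batch."""
    first_batch: Dict[int, int] = {}
    split: Set[int] = set()
    for idx, batch in enumerate(batches):
        for t in batch:
            if t in first_batch and first_batch[t] != idx:
                split.add(t)
            first_batch.setdefault(t, idx)
    return split


# Hypothetical continuous-time edge timestamps: bursty, then sparse, then bursty.
continuous = [0, 1, 2, 3, 10, 20, 30, 40, 41, 42, 43, 44]
size_batches = fixed_size_batches(continuous, batch_size=4)
window_batches = time_window_batches(continuous, window=15)
size_durations = batch_durations(size_batches)
window_counts = [len(b) for b in window_batches]

# Hypothetical discrete-time snapshots: many edges share one timestamp.
discrete = [0, 0, 0, 1, 1, 1, 1, 2, 2]
split_by_size = split_snapshots(fixed_size_batches(discrete, batch_size=4))
split_by_window = split_snapshots(time_window_batches(discrete, window=1))

print("durations of fixed-size batches:", size_durations)
print("edge counts of fixed-duration windows:", window_counts)
print("snapshots split by fixed-size batches:", sorted(split_by_size))
print("snapshots split by time windows:", sorted(split_by_window))
```

In this toy example the fixed-size batches span 3, 30, and 3 time units, so each batch poses a prediction task over a different horizon, whereas the fixed-duration windows keep the horizon constant and let the number of edges vary instead. On the discrete-time data, fixed-size batches spread the edges of snapshots 1 and 2 over two batches each, which imposes an ordering among edges that share a timestamp; windows aligned to the snapshots avoid this.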