🤖 AI Summary
Temporal Graph Neural Networks (TGNNs) suffer from inefficient designs, unsystematic evaluation, and poorly understood performance limits. Method: We introduce the first unified, modular evaluation framework for the TGNN design space and use it to run a design-space search spanning over 10,000 GPU hours. With it we rigorously assess neighbor sampling strategies and attention-based aggregators, propose a static node memory mechanism that is effective on highly repetitive dynamic graphs (with a theoretical mapping to data repetition patterns), and analyze module interactions under diverse temporal data patterns. Contributions/Results: (1) Most-recent neighbor sampling combined with attention aggregation significantly outperforms common alternatives (uniform sampling, MLP-Mixer aggregation); (2) Our standardized benchmark and cross-model evaluation remove the assessment biases introduced by comparing models in their original, heterogeneous implementations; (3) We establish a reproducible, data-driven optimization paradigm for TGNN design, showing that module synergy is strongly governed by temporal data characteristics. Empirical validation confirms both the efficacy of static memory under repetition and the critical role of temporal structure in architectural choices.
📝 Abstract
Temporal Graph Neural Networks (TGNNs) have emerged as powerful tools for modeling dynamic interactions across various domains. The design space of TGNNs is notably complex, given the unique challenges in runtime efficiency and scalability raised by the evolving nature of temporal graphs. We contend that many existing works on TGNN modeling inadequately explore this design space, leading to suboptimal designs. Viewing TGNN models through a performance-focused lens often obstructs a deeper understanding of the advantages and disadvantages of each technique. Specifically, benchmarking efforts inherently evaluate models in their original designs and implementations, resulting in unclear accuracy comparisons and misleading runtime measurements. To address these shortcomings, we propose a practical comparative evaluation framework that performs a design space search across well-known TGNN modules on top of a unified, optimized code implementation. Using our framework, and spending over 10,000 GPU hours, we make the first efforts towards addressing three critical questions in TGNN design: (1) investigating the efficiency of TGNN module designs, (2) analyzing how the effectiveness of these modules correlates with dataset patterns, and (3) exploring the interplay between multiple modules. Key outcomes of this directed investigation include demonstrating that most-recent neighbor sampling and the attention aggregator outperform uniform neighbor sampling and the MLP-Mixer aggregator; establishing static node memory as an effective alternative to dynamic node memory; and showing that the choice between static and dynamic node memory should be based on the repetition patterns in the dataset. Our in-depth analysis of the interplay between TGNN modules and dataset patterns provides deeper insight into TGNN performance, along with potential research directions for designing more general and effective TGNNs.
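To make the sampling comparison concrete, here is a minimal sketch of the two neighbor-sampling strategies the abstract contrasts. The event format, function name, and parameters below are illustrative assumptions, not the paper's actual implementation:

```python
import random

def sample_neighbors(events, node, t, k, strategy="recent"):
    """Sample up to k temporal neighbors of `node` from interactions before time t.

    events: list of (src, dst, timestamp) tuples (hypothetical format).
    strategy="recent": keep the k most recent past interactions.
    strategy="uniform": pick k past interactions uniformly at random.
    """
    # Candidate set: all of the node's interactions strictly before time t.
    candidates = [(dst, ts) for src, dst, ts in events if src == node and ts < t]
    if strategy == "recent":
        # Most-recent sampling: sort by timestamp, keep the k latest.
        return sorted(candidates, key=lambda p: p[1])[-k:]
    # Uniform sampling: every past interaction is equally likely.
    return random.sample(candidates, min(k, len(candidates)))

events = [("u", "a", 1.0), ("u", "b", 2.0), ("u", "c", 3.0), ("u", "d", 4.0)]
print(sample_neighbors(events, "u", t=3.5, k=2, strategy="recent"))
# → [('b', 2.0), ('c', 3.0)]
```

Both strategies respect temporal causality (only events before `t` are candidates); they differ only in whether recency is prioritized, which is the design axis the evaluation framework isolates.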