🤖 AI Summary
Existing graph neural networks (GNNs) require dedicated encoders for each node type and feature column when processing multimodal relational graphs, leading to parameter redundancy and poor generalization. Method: We propose a generic feature encoding framework for heterogeneous temporal graphs, introducing the Perceiver architecture (previously unexplored in relational graph modeling) to jointly encode categorical, numerical, textual, and temporal attributes while enabling cross-modal parameter sharing and preserving permutation invariance. The framework is compatible with mainstream GNN backbones (e.g., ReLGNN, HGT) and trained end-to-end on the RelBench benchmark. Contribution/Results: Our approach matches the performance of specialized encoders (within 3%), reduces parameters by up to 5×, and supports multi-dataset pretraining. It establishes a scalable, lightweight, and unified encoding paradigm for foundation models on relational graphs.
📝 Abstract
Relational multi-table data is common in domains such as e-commerce, healthcare, and scientific research, and can be naturally represented as heterogeneous temporal graphs with multi-modal node attributes. Existing graph neural networks (GNNs) rely on schema-specific feature encoders, requiring separate modules for each node type and feature column, which hinders scalability and parameter sharing. We introduce RELATE (Relational Encoder for Latent Aggregation of Typed Entities), a schema-agnostic, plug-and-play feature encoder that can be used with any general-purpose GNN. RELATE employs shared modality-specific encoders for categorical, numerical, textual, and temporal attributes, followed by a Perceiver-style cross-attention module that aggregates features into a fixed-size, permutation-invariant node representation. We evaluate RELATE with ReLGNN and HGT on the RelBench benchmark, where it achieves performance within 3% of schema-specific encoders while reducing parameter counts by up to 5×. This design supports varying schemas and enables multi-dataset pretraining for general-purpose GNNs, paving the way toward foundation models for relational graph data.
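To make the aggregation step concrete, below is a minimal NumPy sketch of Perceiver-style cross-attention pooling as the abstract describes it: a variable-length set of per-column embeddings (one per feature, already mapped into a shared space by modality-specific encoders) is attended over by a small set of learned latent queries, yielding a fixed-size output that does not change when the columns are reordered. All names (`perceiver_pool`, `latents`, the single-head projections) are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def perceiver_pool(col_embeds, latents, Wq, Wk, Wv):
    """Single-head cross-attention from learned latents to column embeddings.

    col_embeds: (n_cols, d) -- variable-size set of per-feature embeddings
    latents:    (m, d)      -- learned queries; m is fixed per model
    Returns:    (m, d)      -- fixed-size node representation
    """
    Q = latents @ Wq                     # (m, d) queries from latents
    K = col_embeds @ Wk                  # (n_cols, d) keys from features
    V = col_embeds @ Wv                  # (n_cols, d) values from features
    scores = Q @ K.T / np.sqrt(K.shape[1])   # (m, n_cols)
    attn = softmax(scores, axis=-1)      # normalize over the feature set
    return attn @ V                      # weighted sum over columns

# Toy usage: two rows with different numbers of feature columns
# still produce outputs of the same fixed shape.
rng = np.random.default_rng(0)
d, m = 8, 4
latents = rng.normal(size=(m, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

out_a = perceiver_pool(rng.normal(size=(5, d)), latents, Wq, Wk, Wv)
out_b = perceiver_pool(rng.normal(size=(9, d)), latents, Wq, Wk, Wv)
```

Because attention weights are a softmax over the set of columns and the output is a weighted sum over that set, permuting the input rows leaves the result unchanged, which is the permutation-invariance property the abstract highlights.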