🤖 AI Summary
Existing graph neural networks (GNNs) require dedicated encoders for each node type and feature column when processing multimodal relational graphs, leading to parameter redundancy and poor generalization. Method: We propose a generic feature encoding framework for heterogeneous temporal graphs, introducing the Perceiver architecture (previously unexplored in relational graph modeling) to jointly encode categorical, numerical, textual, and temporal attributes while enabling cross-modal parameter sharing and preserving permutation invariance. The framework is compatible with mainstream GNN backbones (e.g., ReLGNN, HGT) and trained end-to-end on the RelBench benchmark. Contribution/Results: Our approach matches the performance of specialized encoders (within 3%), reduces parameters by up to 5×, and supports multi-dataset pretraining. It establishes a scalable, lightweight, and unified encoding paradigm for foundation models on relational graphs.
📝 Abstract
Relational multi-table data is common in domains such as e-commerce, healthcare, and scientific research, and can be naturally represented as heterogeneous temporal graphs with multi-modal node attributes. Existing graph neural networks (GNNs) rely on schema-specific feature encoders, requiring separate modules for each node type and feature column, which hinders scalability and parameter sharing. We introduce RELATE (Relational Encoder for Latent Aggregation of Typed Entities), a schema-agnostic, plug-and-play feature encoder that can be used with any general-purpose GNN. RELATE employs shared modality-specific encoders for categorical, numerical, textual, and temporal attributes, followed by a Perceiver-style cross-attention module that aggregates features into a fixed-size, permutation-invariant node representation. We evaluate RELATE with ReLGNN and HGT on the RelBench benchmark, where it achieves performance within 3% of schema-specific encoders while reducing parameter counts by up to 5×. This design supports varying schemas and enables multi-dataset pretraining for general-purpose GNNs, paving the way toward foundation models for relational graph data.
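To make the aggregation step concrete, below is a minimal NumPy sketch of Perceiver-style cross-attention pooling as the abstract describes it: a variable-length set of per-column embeddings (one per feature, already mapped into a shared space by modality-specific encoders) is attended over by a small set of learned latent queries, yielding a fixed-size output that does not change when the columns are reordered. All names (`perceiver_pool`, `latents`, the single-head projections) are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def perceiver_pool(col_embeds, latents, Wq, Wk, Wv):
    """Single-head cross-attention from learned latents to column embeddings.

    col_embeds: (n_cols, d) -- variable-size set of per-feature embeddings
    latents:    (m, d)      -- learned queries; m is fixed per model
    Returns:    (m, d)      -- fixed-size node representation
    """
    Q = latents @ Wq                     # (m, d) queries from latents
    K = col_embeds @ Wk                  # (n_cols, d) keys from features
    V = col_embeds @ Wv                  # (n_cols, d) values from features
    scores = Q @ K.T / np.sqrt(K.shape[1])   # (m, n_cols)
    attn = softmax(scores, axis=-1)      # normalize over the feature set
    return attn @ V                      # weighted sum over columns

# Toy usage: two rows with different numbers of feature columns
# still produce outputs of the same fixed shape.
rng = np.random.default_rng(0)
d, m = 8, 4
latents = rng.normal(size=(m, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

out_a = perceiver_pool(rng.normal(size=(5, d)), latents, Wq, Wk, Wv)
out_b = perceiver_pool(rng.normal(size=(9, d)), latents, Wq, Wk, Wv)
```

Because attention weights are a softmax over the set of columns and the output is a weighted sum over that set, permuting the input rows leaves the result unchanged, which is the permutation-invariance property the abstract highlights.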