RELATE: A Schema-Agnostic Perceiver Encoder for Multimodal Relational Graphs

📅 2025-10-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing graph neural networks (GNNs) require dedicated encoders for each node type and feature column when processing multimodal relational graphs, leading to parameter redundancy and poor generalization. Method: We propose the first generic feature encoding framework for heterogeneous temporal graphs, introducing the Perceiver architecture (previously unexplored in relational graph modeling) to jointly encode categorical, numerical, textual, and temporal attributes while enabling cross-modal parameter sharing and preserving permutation invariance. The framework is compatible with mainstream GNN backbones (e.g., ReLGNN, HGT) and is trained end-to-end on the RelBench benchmark. Contribution/Results: Our approach matches the performance of specialized encoders (within 3%), reduces parameter counts by up to 5×, and supports multi-dataset pretraining. It establishes a scalable, lightweight, and unified encoding paradigm for foundation models on relational graphs.

📝 Abstract
Relational multi-table data is common in domains such as e-commerce, healthcare, and scientific research, and can be naturally represented as heterogeneous temporal graphs with multi-modal node attributes. Existing graph neural networks (GNNs) rely on schema-specific feature encoders, requiring separate modules for each node type and feature column, which hinders scalability and parameter sharing. We introduce RELATE (Relational Encoder for Latent Aggregation of Typed Entities), a schema-agnostic, plug-and-play feature encoder that can be used with any general-purpose GNN. RELATE employs shared modality-specific encoders for categorical, numerical, textual, and temporal attributes, followed by a Perceiver-style cross-attention module that aggregates features into a fixed-size, permutation-invariant node representation. We evaluate RELATE with ReLGNN and HGT on the RelBench benchmark, where it achieves performance within 3% of schema-specific encoders while reducing parameter counts by up to 5×. This design supports varying schemas and enables multi-dataset pretraining for general-purpose GNNs, paving the way toward foundation models for relational graph data.
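The core aggregation step described in the abstract (learned latent queries cross-attending over a variable number of modality-encoded column tokens to produce a fixed-size, permutation-invariant node representation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `perceiver_pool`, the single-head attention, and all dimensions are assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def perceiver_pool(tokens, latents, Wq, Wk, Wv):
    """Cross-attention from learned latents to a variable-length token set.

    tokens : (N, d) modality-encoded feature tokens for one node
             (e.g., one token per encoded table column, N varies by schema)
    latents: (L, d) learned latent queries (L is fixed regardless of schema)
    Returns a fixed-size (L, d) array, permutation-invariant in the tokens.
    """
    q = latents @ Wq                                          # (L, d)
    k = tokens @ Wk                                           # (N, d)
    v = tokens @ Wv                                           # (N, d)
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)   # (L, N)
    return attn @ v                                           # (L, d)

rng = np.random.default_rng(0)
d, L, N = 8, 4, 6
Wq, Wk, Wv = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
latents = rng.standard_normal((L, d))
tokens = rng.standard_normal((N, d))

out = perceiver_pool(tokens, latents, Wq, Wk, Wv)
out_perm = perceiver_pool(tokens[rng.permutation(N)], latents, Wq, Wk, Wv)
assert out.shape == (L, d)              # fixed size, independent of N
assert np.allclose(out, out_perm)       # column order does not matter
```

Because attention sums over the token axis, reordering the input columns leaves the output unchanged, and the same latent queries can consume any number of tokens, which is what lets one encoder serve arbitrary schemas.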
Problem

Research questions and friction points this paper is trying to address.

Handles multimodal relational graphs with heterogeneous data
Eliminates schema-specific encoders for scalable graph processing
Enables parameter sharing across varying graph schemas
Innovation

Methods, ideas, or system contributions that make the work stand out.

Schema-agnostic encoder for multimodal relational graphs
Shared modality-specific encoders with Perceiver cross-attention
Fixed-size permutation-invariant node representation aggregation
Joseph Meyer
SAP, Palo Alto, CA, USA
Divyansha Lachi
Graduate Student, University of Pennsylvania
Geometric Deep Learning · Reinforcement Learning · Computational Neuroscience
Reza Mohammadi
SAP, Seattle, WA, USA
Roshan Reddy Upendra
SAP, Palo Alto, CA, USA
Eva L. Dyer
University of Pennsylvania, CIFAR
Computational Neuroscience · Machine Learning · Signal Processing · Self-Supervised Learning
Mark Li
SAP, Seattle, WA, USA
Tom Palczewski
SAP, Palo Alto, CA, USA