Relational Deep Learning: Challenges, Foundations and Next-Generation Architectures

2025-06-19
AI Summary
End-to-end representation learning for multi-table relational databases remains challenging due to the need for manual feature engineering and the lack of unified structural abstractions across heterogeneous, time-evolving tables. Method: We propose Relational Deep Learning (RDL), a framework that bypasses traditional feature engineering by formally introducing the *temporal heterogeneous relational entity graph*, a unified graph structure where primary-foreign key relationships define edges, schema constraints govern node/edge types, and timestamps serve as dynamic attributes. RDL integrates graph neural networks (GNNs), relational algebraic modeling, temporal graph learning, and heterogeneous graph architectures to enable cross-table joint representation learning. Contribution: This work establishes the first theoretical foundation and technical roadmap for RDL, systematically identifying core challenges and curating benchmark datasets. It advances graph representation learning toward relational data foundation models and introduces a novel paradigm for large-scale, multi-table joint modeling.

Abstract
Graph machine learning has led to a significant increase in the capabilities of models that learn on arbitrary graph-structured data and has been applied to molecules, social networks, recommendation systems, and transportation, among other domains. Data in multi-table relational databases can also be represented as 'relational entity graphs' for Relational Deep Learning (RDL), a new blueprint that enables end-to-end representation learning without traditional feature engineering. Compared to arbitrary graph-structured data, relational entity graphs have three key properties: (i) their structure is defined by primary-foreign key relationships between entities in different tables, (ii) the structural connectivity is a function of the relational schema defining a database, and (iii) the graph connectivity is temporal and heterogeneous in nature. In this paper, we provide a comprehensive review of RDL by first introducing the representation of relational databases as relational entity graphs, and then reviewing public benchmark datasets that have been used to develop and evaluate recent GNN-based RDL models. We discuss key challenges including large-scale multi-table integration and the complexities of modeling temporal dynamics and heterogeneous data, while also surveying foundational neural network methods and recent architectural advances specialized for relational entity graphs. Finally, we explore opportunities to unify these distinct modeling challenges, highlighting how RDL brings together multiple sub-fields in graph machine learning toward the design of foundation models that can transform the processing of relational data.
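
To make properties (i) and (ii) concrete, the following is a minimal sketch of how rows and primary-foreign key links could be turned into a typed graph. It is an illustration, not the paper's implementation: the toy `customers` and `orders` tables, the function name `build_entity_graph`, and the plain-dictionary graph format are all assumptions chosen for brevity.

```python
from collections import defaultdict

# Hypothetical toy database: two tables linked by one foreign key.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "ts": "2024-01-05"},
    {"order_id": 11, "customer_id": 1, "ts": "2024-02-01"},
    {"order_id": 12, "customer_id": 2, "ts": "2024-01-20"},
]


def build_entity_graph(tables, primary_keys, foreign_keys):
    """Build a typed graph from tables (illustrative helper, not a library API).

    tables:       {table_name: list of row dicts}
    primary_keys: {table_name: primary-key column}
    foreign_keys: list of (child_table, fk_column, parent_table)
    """
    # One node per row; the node type is the table name.
    nodes = {}
    for table, rows in tables.items():
        pk = primary_keys[table]
        for row in rows:
            nodes[(table, row[pk])] = row

    # One edge per foreign-key reference; the edge type comes from the schema.
    edges = defaultdict(list)
    for child, fk_col, parent in foreign_keys:
        child_pk = primary_keys[child]
        for row in tables[child]:
            src = (child, row[child_pk])
            dst = (parent, row[fk_col])
            if dst in nodes:  # skip dangling references
                edges[(child, fk_col, parent)].append((src, dst))
    return nodes, dict(edges)


nodes, edges = build_entity_graph(
    tables={"customers": customers, "orders": orders},
    primary_keys={"customers": "customer_id", "orders": "order_id"},
    foreign_keys=[("orders", "customer_id", "customers")],
)
```

Here the set of edge types is fixed entirely by the `foreign_keys` list, which is one way to read the claim that connectivity is a function of the relational schema. A practical pipeline would typically also encode row features and map the result into a heterogeneous GNN library's data format.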
Problem

Research questions and friction points this paper is trying to address.

Representing relational databases as graphs for deep learning
Addressing challenges in multi-table integration and temporal dynamics
Developing foundation models for relational data processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Relational entity graphs for end-to-end learning
GNN-based models for relational databases
Temporal and heterogeneous graph connectivity
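
One common way to respect temporal connectivity is to let a prediction made at a given time see only rows whose timestamps are not later than that time. The sketch below illustrates that idea under stated assumptions: the edge list, the function name `neighbors_before`, and the inclusive cutoff are hypothetical choices, not details taken from the paper.

```python
from datetime import date

# Hypothetical timestamped edges: (customer_id, order_id, order_date).
order_edges = [
    (1, 10, date(2024, 1, 5)),
    (1, 11, date(2024, 2, 1)),
    (2, 12, date(2024, 1, 20)),
]


def neighbors_before(edges, node, seed_time):
    """Return order ids linked to `node` with a timestamp at or before seed_time.

    Filtering by time keeps information from the future out of the
    neighborhood used for a prediction made at `seed_time`.
    """
    return [
        order_id
        for customer_id, order_id, ts in edges
        if customer_id == node and ts <= seed_time
    ]


visible = neighbors_before(order_edges, node=1, seed_time=date(2024, 1, 31))
```

With a seed time of 2024-01-31, customer 1 sees only order 10, because order 11 is placed afterwards. The same customer therefore has a different neighborhood at different prediction times.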