REDELEX: A Framework for Relational Deep Learning Exploration

📅 2025-06-27

🤖 AI Summary

Problem: Existing research lacks a systematic analysis of how the performance of relational deep learning (RDL) models relates to the intrinsic characteristics of the underlying relational databases (RDBs).

Method: The authors introduce REDELEX, an open-source evaluation framework covering more than 70 real-world databases. It models RDBs uniformly as heterogeneous graphs and integrates graph neural network (GNN) architectures of varying complexity. RDL models are benchmarked alongside key representatives of classic methods on predictive tasks.

Contribution/Results: The study investigates how model complexity, database size, and structural properties (such as cardinality, normalization level, and foreign-key density) affect RDL efficacy. Experiments confirm the generally superior performance of RDL and identify the main database characteristics shaping performance, offering empirical guidance for model selection. The database collection is made available to the community.

📝 Abstract

Relational databases (RDBs) are widely regarded as the gold standard for storing structured information. Consequently, predictive tasks leveraging this data format hold significant application promise. Recently, Relational Deep Learning (RDL) has emerged as a novel paradigm wherein RDBs are conceptualized as graph structures, enabling the application of various graph neural architectures to effectively address these tasks. However, given its novelty, there is a lack of analysis into the relationships between the performance of various RDL models and the characteristics of the underlying RDBs. In this study, we present REDELEX, a comprehensive exploration framework for evaluating RDL models of varying complexity on the most diverse collection of over 70 RDBs, which we make available to the community. Benchmarked alongside key representatives of classic methods, we confirm the generally superior performance of RDL while providing insights into the main factors shaping performance, including model complexity, database sizes and their structural properties.
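To make the idea of treating an RDB as a graph concrete, here is a minimal sketch, not the REDELEX implementation. It assumes the common convention that every row becomes a node typed by its table and every foreign-key reference becomes a typed edge. The toy tables, the `foreign_keys` mapping, and the function name `rdb_to_hetero_graph` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy database: two tables linked by one foreign key.
tables = {
    "customers": [
        {"id": 1, "name": "Ada"},
        {"id": 2, "name": "Bob"},
    ],
    "orders": [
        {"id": 10, "customer_id": 1, "amount": 25.0},
        {"id": 11, "customer_id": 1, "amount": 40.0},
        {"id": 12, "customer_id": 2, "amount": 15.0},
    ],
}

# (child table, foreign-key column) -> parent table (assumed schema description)
foreign_keys = {("orders", "customer_id"): "customers"}


def rdb_to_hetero_graph(tables, foreign_keys, pk="id"):
    """Return (nodes, edges).

    nodes: table name -> {primary key -> row}, so each row is a typed node.
    edges: (child, fk column, parent) -> list of (child key, parent key) pairs.
    """
    nodes = {name: {row[pk]: row for row in rows} for name, rows in tables.items()}
    edges = defaultdict(list)
    for (child, fk_col), parent in foreign_keys.items():
        for row in tables[child]:
            ref = row.get(fk_col)
            # Skip NULL or dangling references instead of creating broken edges.
            if ref is not None and ref in nodes[parent]:
                edges[(child, fk_col, parent)].append((row[pk], ref))
    return nodes, dict(edges)


nodes, edges = rdb_to_hetero_graph(tables, foreign_keys)
```

In this sketch the edge type keeps the foreign-key column name, so two different foreign keys between the same pair of tables would remain distinguishable.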

Problem

Research questions and friction points this paper is trying to address.

Analyzes RDL model performance vs. RDB characteristics

Evaluates RDL models on a diverse collection of 70+ RDBs

Identifies key factors affecting RDL performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Framework for evaluating diverse RDL models

Graph neural architectures on relational databases

Analysis of performance factors in RDL
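As a rough illustration of what applying a graph neural architecture to relational data involves, the sketch below performs plain mean neighbor aggregation along foreign-key edges. This is only the non-learned core of one message-passing step, without weights or nonlinearities, and it is not any model evaluated in the paper. The feature values, edge list, and the name `mean_aggregate` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy inputs: one numeric feature per child row, plus
# foreign-key edges given as (child_row_id, parent_row_id) pairs.
order_amount = {10: 25.0, 11: 40.0, 12: 15.0}
order_to_customer = [(10, 1), (11, 1), (12, 2)]


def mean_aggregate(child_features, fk_edges):
    """Average the features of child rows into each parent row they reference."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for child_id, parent_id in fk_edges:
        sums[parent_id] += child_features[child_id]
        counts[parent_id] += 1
    # Only parents with at least one referencing child appear in the result.
    return {parent_id: sums[parent_id] / counts[parent_id] for parent_id in sums}


customer_repr = mean_aggregate(order_amount, order_to_customer)
```

A learned GNN layer would typically apply trainable transformations before and after this aggregation, and would use a separate set of parameters for each edge type.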
