Synthesize, Retrieve, and Propagate: A Unified Predictive Modeling Framework for Relational Databases

📅 2025-08-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
The rigid structure of relational databases (RDBs) makes it hard for deep learning models to capture dependencies that span multiple tables. Existing approaches rely solely on primary key-foreign key relationships, either joining tables into one or constructing a graph over them, and so neglect implicit composite relationships across tables. Method: We propose SRP, a unified prediction framework that introduces a **content-based cross-table retrieval mechanism** and combines it with feature synthesis and graph neural network (GNN) message passing to model both unary and composite dependencies. By going beyond conventional join and graph construction, SRP expands the model’s receptive field over the table structure. Contribution/Results: Experiments on five real-world datasets show that SRP consistently outperforms state-of-the-art baselines. Ablation studies examine the contribution of each component and support SRP’s applicability in industrial settings.

📝 Abstract
Relational databases (RDBs) have become the industry standard for storing massive and heterogeneous data. However, despite their widespread use across various fields, the inherent structure of relational databases hinders their ability to benefit from flourishing deep learning methods. Previous research has primarily focused on exploiting the unary dependency among multiple tables in a relational database through primary key-foreign key relationships, either joining multiple tables into a single table or constructing a graph among them. This leaves the implicit composite relations among different tables, and substantial potential for improvement in predictive modeling, unexplored. In this paper, we propose SRP, a unified predictive modeling framework that synthesizes features using the unary dependency, retrieves related information to capture the composite dependency, and propagates messages across a constructed graph to learn adjacent patterns for prediction on relational databases. By introducing a new retrieval mechanism into RDBs, SRP is designed to fully capture both the unary and the composite dependencies within a relational database, thereby enlarging the receptive field of tabular data prediction. In addition, we conduct a comprehensive analysis of the components of SRP, offering a nuanced understanding of model behaviors and practical guidelines for future applications. Extensive experiments on five real-world datasets demonstrate the effectiveness of SRP and its potential applicability in industrial scenarios. The code is released at https://github.com/NingLi670/SRP.
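
The two prior strategies the abstract describes, joining tables on a key and building a graph from primary key-foreign key links, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `users` and `orders` tables, their columns, and the helper functions `join_on_key` and `build_fk_graph` are hypothetical and are not taken from the SRP code base.

```python
from collections import defaultdict

# Hypothetical toy tables: each row is a dict, and "user_id" in
# `orders` is a foreign key pointing at the primary key of `users`.
users = [
    {"user_id": 1, "age": 34},
    {"user_id": 2, "age": 27},
]
orders = [
    {"order_id": 10, "user_id": 1, "amount": 20.0},
    {"order_id": 11, "user_id": 1, "amount": 35.0},
    {"order_id": 12, "user_id": 2, "amount": 12.5},
]


def join_on_key(parents, children, key):
    """Flatten two tables into one by attaching each child row to its parent row."""
    by_key = {p[key]: p for p in parents}
    return [{**by_key[c[key]], **c} for c in children if c[key] in by_key]


def build_fk_graph(children, fk, child_pk, parent_table, child_table):
    """Treat rows as nodes and every foreign-key reference as an undirected edge."""
    adj = defaultdict(set)
    for c in children:
        parent_node = (parent_table, c[fk])
        child_node = (child_table, c[child_pk])
        adj[parent_node].add(child_node)
        adj[child_node].add(parent_node)
    return dict(adj)


joined = join_on_key(users, orders, "user_id")
graph = build_fk_graph(orders, "user_id", "order_id", "users", "orders")
```

Both views connect rows only through declared keys. Rows that are related by content but share no key stay disconnected, which is the gap the abstract attributes to prior work.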
Problem

Research questions and friction points this paper is trying to address.

Enhancing predictive modeling on relational databases using deep learning
Capturing both unary and composite dependencies in relational data
Improving receptive field for tabular data prediction through retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthesizes features using unary dependency
Retrieves information capturing composite dependencies
Propagates messages across graph for prediction
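
The three stages listed above can be sketched on toy data. This is a minimal illustration under stated assumptions, not the authors' implementation: the tables, the function names `synthesize`, `retrieve`, and `propagate`, the use of Euclidean nearest neighbors as the content-based retrieval, and the single round of mean aggregation standing in for GNN message passing are all hypothetical choices. For brevity, retrieval here compares rows of one table, whereas the paper describes retrieval across tables.

```python
import math
from collections import defaultdict

# Hypothetical toy tables; "user_id" in `orders` is a foreign key to `users`.
users = [
    {"user_id": 1, "age": 34.0},
    {"user_id": 2, "age": 27.0},
    {"user_id": 3, "age": 35.0},
]
orders = [
    {"user_id": 1, "amount": 20.0},
    {"user_id": 1, "amount": 35.0},
    {"user_id": 2, "amount": 12.5},
    {"user_id": 3, "amount": 30.0},
    {"user_id": 3, "amount": 24.0},
]


def synthesize(users, orders):
    """Stage 1: aggregate child rows over the PK-FK link into per-user features."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for o in orders:
        totals[o["user_id"]] += o["amount"]
        counts[o["user_id"]] += 1
    return {
        u["user_id"]: [u["age"], float(counts[u["user_id"]]), totals[u["user_id"]]]
        for u in users
    }


def retrieve(features, k):
    """Stage 2: link each row to its k most similar rows by feature content."""
    neighbors = {}
    for uid, vec in features.items():
        ranked = sorted(
            (math.dist(vec, other), oid)
            for oid, other in features.items()
            if oid != uid
        )
        neighbors[uid] = [oid for _, oid in ranked[:k]]
    return neighbors


def propagate(features, neighbors):
    """Stage 3: one round of mean aggregation over each row and its retrieved neighbors."""
    updated = {}
    for uid, vec in features.items():
        group = [vec] + [features[n] for n in neighbors[uid]]
        updated[uid] = [sum(col) / len(group) for col in zip(*group)]
    return updated


features = synthesize(users, orders)
neighbors = retrieve(features, k=1)
updated = propagate(features, neighbors)
```

In this sketch users 1 and 3 share no key, yet retrieval links them because their synthesized features are close, so propagation can mix information between them. A real system would likely normalize features before measuring distance and learn the aggregation rather than averaging.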
Authors

Ning Li
Shanghai Jiao Tong University
Kounianhua Du
Shanghai Jiao Tong University
Data Science, Large Language Models
Han Zhang
Shanghai Jiao Tong University
Quan Gan
AWS Shanghai AI Lab
Minjie Wang
AWS Shanghai AI Lab
David Wipf
Principal Research Scientist, Amazon Web Services
Deep generative models, sparse representations, Bayesian inference, graph neural networks
Weinan Zhang
Shanghai Jiao Tong University