RelBench v2: A Large-Scale Benchmark and Repository for Relational Data

📅 2026-02-13
📈 Citations: 0
Influential: 0

📝 Abstract
Relational deep learning (RDL) has emerged as a powerful paradigm for learning directly on relational databases by modeling entities and their relationships across multiple interconnected tables. As this paradigm evolves toward larger models and relational foundation models, scalable and realistic benchmarks are essential for enabling systematic evaluation and progress. In this paper, we introduce RelBench v2, a major expansion of the RelBench benchmark for RDL. RelBench v2 adds four large-scale relational datasets spanning scholarly publications, enterprise resource planning, consumer platforms, and clinical records, increasing the benchmark to 11 datasets comprising over 22 million rows across 29 tables. We further introduce autocomplete tasks, a new class of predictive objectives that require models to infer missing attribute values directly within relational tables while respecting temporal constraints, expanding beyond traditional forecasting tasks constructed via SQL queries. In addition, RelBench v2 expands beyond its native datasets by integrating external benchmarks and evaluation frameworks: we translate event streams from the Temporal Graph Benchmark into relational schemas for unified relational-temporal evaluation, interface with ReDeLEx to provide uniform access to 70+ real-world databases suitable for pretraining, and incorporate 4DBInfer datasets and tasks to broaden multi-table prediction coverage. Experimental results demonstrate that RDL models consistently outperform single-table baselines across autocomplete, forecasting, and recommendation tasks, highlighting the importance of modeling relational structure explicitly.
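To make the autocomplete idea concrete, here is a minimal sketch of how such a task could be framed under a temporal constraint. It is not the RelBench v2 API: the `orders` schema, the `Row` class, and the helpers `make_autocomplete_split` and `visible_context` are hypothetical names chosen for illustration. The sketch assumes that the target attribute is hidden for rows at or after a cutoff date, and that a model may only consult related rows with strictly earlier timestamps.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Row:
    """One row of a hypothetical `orders` table (illustrative schema)."""
    order_id: int
    customer_id: int
    created_at: date
    category: Optional[str]  # the attribute to autocomplete


def make_autocomplete_split(rows, cutoff):
    """Split rows by time and hide the target attribute on the test side.

    Rows created before `cutoff` keep their label and form the training set.
    Rows created on or after `cutoff` have the label masked; the true values
    are returned separately so predictions can be scored.
    """
    train, test_inputs, test_labels = [], [], {}
    for r in rows:
        if r.category is None:
            continue  # no label to learn from or score against
        if r.created_at < cutoff:
            train.append(r)
        else:
            test_inputs.append(Row(r.order_id, r.customer_id, r.created_at, None))
            test_labels[r.order_id] = r.category
    return train, test_inputs, test_labels


def visible_context(row, train):
    """Related rows a model may consult for `row`: same customer and a
    strictly earlier timestamp, so no information leaks from the future."""
    return [t for t in train
            if t.customer_id == row.customer_id and t.created_at < row.created_at]


rows = [
    Row(1, 10, date(2024, 1, 5), "books"),
    Row(2, 10, date(2024, 2, 9), "books"),
    Row(3, 11, date(2024, 3, 1), "toys"),
    Row(4, 10, date(2024, 6, 2), "books"),
    Row(5, 11, date(2024, 7, 7), "games"),
]
train, test_inputs, test_labels = make_autocomplete_split(rows, date(2024, 5, 1))
```

A single-table baseline would predict `category` from the masked row alone, whereas a relational model could also use the rows returned by `visible_context`. That difference is the kind of gap the abstract's comparison is about.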
Problem

Research questions and friction points this paper is trying to address.

relational deep learning
benchmark
relational databases
foundation models
evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

relational deep learning
benchmark
autocomplete tasks
temporal constraints
multi-table prediction
Justin Gu
Stanford University
Rishabh Ranjan
Stanford University
Charilaos Kanatsoulis
Research Associate, Stanford University
Machine Learning, Signal Processing, Graph Deep Learning, Tensor Models
Haiming Tang
National University of Singapore
Martin Jurkovic
University of Ljubljana
Valter Hudovernik
University of Ljubljana
machine learning, deep learning, synthetic data
Mark Znidar
University of Oxford
Pranshu Chaturvedi
Stanford University
Parth Shroff
Stanford University
Fengyu Li
Stanford University
Jure Leskovec
Professor of Computer Science, Stanford University
Data mining, Machine Learning, Graph Neural Networks, Knowledge Graphs, Complex Networks