Graph-centric Cross-model Data Integration and Analytics in a Unified Multi-model Database

📅 2026-03-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing multi-model databases struggle to support graph-centric cross-model data integration and analytics efficiently because they lack global optimization and high-performance execution mechanisms. This work proposes GredoDB, a unified multi-model database that natively stores graph, relational, and document data. GredoDB introduces three key innovations: topology- and attribute-aware graph operators, a unified cross-model query optimization framework, and a parallel, operator-level execution architecture with intermediate result materialization. Evaluated on the M2Bench benchmark against state-of-the-art multi-model databases, GredoDB achieves an average response-time speedup of 10.89× (up to 107.89×) on data integration tasks and 37.79× (up to 356.72×) on analytical workloads.

📝 Abstract
Graph-centric cross-model data integration and analytics (GCDIA) refers to tasks that use the graph model as the central paradigm to integrate relevant information across heterogeneous data models, such as the relational and document models, and subsequently perform complex analytics such as regression and similarity computation. As modern applications generate increasingly diverse data and move beyond simple retrieval toward advanced analytical objectives (e.g., prediction and recommendation), GCDIA has grown in importance. Existing multi-model databases (MMDBs) struggle to efficiently support both integration (GCDI) and analytics (GCDA) in GCDIA. For GCDI, they typically separate graph processing from the other models and apply no global optimization; for GCDA, they rely on tuple-at-a-time execution. Both choices limit performance and scalability. To address these limitations, we propose GredoDB, a unified MMDB that natively stores graph, relational, and document models while efficiently processing GCDIA. Specifically, we design 1) topology- and attribute-aware graph operators for efficient predicate-aware traversal, 2) a unified GCDI optimization framework to exploit cross-model correlations, and 3) a parallel GCDA architecture that materializes intermediate results for operator-level execution. Experiments on the widely adopted multi-model benchmark M2Bench demonstrate that, in terms of response time, GredoDB achieves up to 107.89× and on average 10.89× speedup on GCDI, and up to 356.72× and on average 37.79× speedup on GCDA, compared to state-of-the-art (SOTA) MMDBs.
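The idea of predicate-aware traversal mentioned in the abstract can be illustrated with a minimal Python sketch. This is not GredoDB's operator or API: the function `predicate_aware_bfs`, the adjacency dictionary, and the attribute records are all hypothetical, chosen only to show how evaluating an attribute predicate during expansion prunes the search, rather than filtering after the full traversal.

```python
from collections import deque

def predicate_aware_bfs(adj, attrs, start, predicate, max_hops):
    """Breadth-first traversal that expands only nodes whose attributes
    satisfy `predicate` (hypothetical helper, for illustration only)."""
    if not predicate(attrs[start]):
        return []
    visited = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # hop limit reached, do not expand further
        for nbr in adj.get(node, []):
            if nbr in visited:
                continue
            # The predicate is checked during expansion, so a failing
            # node is never enqueued and its subtree is never explored.
            if predicate(attrs[nbr]):
                visited.add(nbr)
                order.append(nbr)
                queue.append((nbr, depth + 1))
    return order

# Toy graph: topology in `adj`, per-node attributes in `attrs`.
adj = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
attrs = {
    "a": {"age": 30},
    "b": {"age": 17},
    "c": {"age": 40},
    "d": {"age": 50},
    "e": {"age": 25},
}

result = predicate_aware_bfs(adj, attrs, "a", lambda r: r["age"] >= 18, 2)
print(result)
```

In this toy graph, node `b` fails the predicate, so node `d` is never visited even though `d` itself would pass. Note that this changes the semantics relative to filtering the final result set: only paths whose every node satisfies the predicate are followed.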
Problem

Research questions and friction points this paper is trying to address.

graph-centric
cross-model data integration
multi-model database
data analytics
performance scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

graph-centric integration
multi-model database
cross-model optimization
parallel graph analytics
unified data model
Zepeng Liu
Wuhan University
database, clustering
Sheng Wang
Wuhan University
Nano-optics, Optoelectronic devices, SPM, Low-dimensional materials
Shixun Huang
University of Wollongong
data mining, graph databases, machine learning, algorithms
Hailang Qiu
School of Computer Science, Wuhan University, Wuhan, China
Yuwei Peng
School of Computer Science, Wuhan University, Wuhan, China
Jiale Feng
School of Computer Science, Wuhan University, Wuhan, China
Shunan Liao
School of Computer Science, Wuhan University, Wuhan, China
Yushuai Ji
Wuhan University
Vector Search, Clustering Algorithm, Vector Database
Zhiyong Peng
School of Computer Science, Wuhan University, Wuhan, China; Big Data Institute, Wuhan University, Wuhan, China