🤖 AI Summary
Existing neural graph databases suffer from low training efficiency and limited expressiveness due to query-level batching and structure-specific embeddings. This work proposes an operator-level training framework that decouples logical operators from query topology, recasting the training loop as dynamically scheduled dataflow execution with multi-stream parallel computation; a complementary decoupled architecture integrates high-dimensional semantic priors from pretrained text encoders while avoiding I/O stalls and memory overflows. The approach achieves 1.8–6.8× throughput gains across six benchmarks, sustains high GPU utilization, and alleviates representation friction in hybrid neuro-symbolic reasoning.
📝 Abstract
Neural Graph Databases (NGDBs) facilitate complex logical reasoning over incomplete knowledge structures, yet their training efficiency and expressivity are constrained by rigid query-level batching and structure-exclusive embeddings. We present NGDB-Zoo, a unified framework that resolves these bottlenecks by combining operator-level training with semantic augmentation. By decoupling logical operators from query topologies, NGDB-Zoo transforms the training loop into dynamically scheduled dataflow execution, enabling multi-stream parallelism and achieving a $1.8\times$–$6.8\times$ throughput improvement over baselines. Furthermore, we formalize a decoupled architecture that integrates high-dimensional semantic priors from Pre-trained Text Encoders (PTEs) without triggering I/O stalls or memory overflows. Extensive evaluations on six benchmarks, including massive graphs such as ogbl-wikikg2 and ATLAS-Wiki, demonstrate that NGDB-Zoo maintains high GPU utilization across diverse logical patterns and significantly mitigates representation friction in hybrid neuro-symbolic reasoning.
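To make the core idea concrete, the following is a minimal, hypothetical sketch (not the paper's actual implementation) of operator-level scheduling: each query is a small DAG of logical operators, and instead of batching whole queries, ready operators of the same type are grouped across queries in dependency order, so each group could in principle be dispatched as one fused kernel launch. All function and field names here are illustrative assumptions.

```python
from collections import defaultdict, deque

def schedule_operator_batches(queries):
    """Hypothetical operator-level scheduler.

    queries: list of query DAGs; each DAG is a dict
             op_id -> (op_type, [dependency_op_ids]).
    Returns a list of (op_type, [(query_idx, op_id), ...]) batches,
    emitted in dataflow (dependency-respecting) order, where each batch
    groups same-type operators from *different* query topologies.
    """
    # Count unmet dependencies per (query, operator) node.
    indeg = {}
    dependents = defaultdict(list)
    for qi, dag in enumerate(queries):
        for op_id, (op_type, deps) in dag.items():
            indeg[(qi, op_id)] = len(deps)
            for d in deps:
                dependents[(qi, d)].append((qi, op_id))

    # Kahn-style frontier: all operators whose inputs are already available.
    ready = deque(node for node, n in indeg.items() if n == 0)
    batches = []
    while ready:
        # Group the current ready frontier by operator type, so each
        # group corresponds to one batched launch (e.g. on its own stream).
        by_type = defaultdict(list)
        for _ in range(len(ready)):
            qi, op_id = ready.popleft()
            by_type[queries[qi][op_id][0]].append((qi, op_id))
        for op_type in sorted(by_type):
            batch = by_type[op_type]
            batches.append((op_type, batch))
            # Releasing a batch may unlock downstream operators.
            for node in batch:
                for succ in dependents[node]:
                    indeg[succ] -= 1
                    if indeg[succ] == 0:
                        ready.append(succ)
    return batches

# Two queries with different topologies: their projection operators are
# nevertheless batched together in the first dataflow step.
q0 = {"a": ("proj", []), "b": ("proj", []), "c": ("inter", ["a", "b"])}
q1 = {"x": ("proj", []), "y": ("neg", ["x"])}
plan = schedule_operator_batches([q0, q1])
```

In a real system each batch would map to a fused GPU kernel, and independent batches in the same frontier (here `inter` and `neg`) could run concurrently on separate CUDA streams; this toy only computes the schedule.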