🤖 AI Summary
Existing neural graph databases (NGDBs) support only single-graph operations, limiting their ability to model multi-source heterogeneous graph data; moreover, direct cross-source aggregation of sensitive graph data poses severe privacy risks. To address this, we propose the first NGDB framework enabling privacy-preserving multi-source collaborative inference. Our approach deeply integrates federated learning with graph neural networks (GNNs) to achieve cross-source structural representation learning while keeping raw data strictly local. We design a privacy-aware distributed optimization mechanism that enforces strict data localization and regulatory compliance. The framework ensures end-to-end privacy security—via formal privacy guarantees—while significantly improving cross-source graph query accuracy and downstream task performance. This work overcomes the longstanding technical bottleneck wherein joint multi-graph modeling and strong privacy protection were considered mutually exclusive.
📝 Abstract
The increasing demand for large-scale language models (LLMs) has highlighted the importance of efficient data retrieval mechanisms. Neural graph databases (NGDBs) have emerged as a promising approach to storing and querying graph-structured data in neural space, enabling the retrieval of relevant information for LLMs. However, existing NGDBs are typically designed to operate on a single graph, limiting their ability to reason across multiple graphs. Furthermore, the lack of support for multi-source graph data in existing NGDBs hinders their ability to capture the complexity and diversity of real-world data. In many applications, data is distributed across multiple sources, and the ability to reason across these sources is crucial for making informed decisions. This limitation is particularly problematic when dealing with sensitive graph data, as directly sharing and aggregating such data poses significant privacy risks. As a result, many applications that rely on NGDBs are forced to choose between compromising data privacy or sacrificing the ability to reason across multiple graphs. To address these limitations, we propose Federated Neural Graph Database (FedNGDB), a novel framework that enables reasoning over multi-source graph-based data while preserving privacy. FedNGDB leverages federated learning to collaboratively learn graph representations across multiple sources, enriching relationships between entities and improving the overall quality of the graph data. Unlike existing methods, FedNGDB can handle complex graph structures and relationships, making it suitable for various downstream tasks.