ConRAD: Conformal Risk-Aware Neural Databases

📅 2026-05-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses neural multi-hop querying over incomplete knowledge graphs, where existing methods suffer from error propagation and offer no formal recall guarantees. The authors propose the first neural graph database query framework that supports declarative recall guarantees: given a user-specified risk budget, it uses Conformal Risk Control to automatically calibrate per-operator prediction thresholds with finite-sample, distribution-free statistical validity, while maximizing end-to-end precision. A quantile-space scalarization reduces the high-dimensional threshold search to a single parameter, and a new physical operator, the conformal gate, dynamically bypasses neural inference when local graph evidence suffices. Across three benchmarks and three query topologies, the method satisfies every prescribed risk budget, with empirical recall falling below the target by at most 0.046. It also reduces neural invocations to zero in near-complete graph regions and matches or exceeds the precision of best-case static baselines, which require manual threshold search and offer no guarantees.

📝 Abstract

Querying incomplete knowledge graphs with neural predictors is powerful but dangerous. Errors compound across multi-hop pipelines with no formal bound on the completeness of results. We introduce ConRAD, the first framework to enforce declarative recall guarantees natively within a neural graph database query engine. Given a user-specified risk budget, ConRAD automatically derives per-operator prediction thresholds that satisfy the recall target with finite-sample, distribution-free statistical validity via Conformal Risk Control, while maximizing end-to-end precision. To scale calibration across multi-operator query topologies, we introduce a quantile-space scalarization that reduces intractable high-dimensional threshold searches to a single parameter. We further design the conformal gate, a novel physical operator that dynamically bypasses neural inference when local graph evidence suffices, eliminating unnecessary model inferences in dense graph regions. Evaluated across three benchmarks and three query topologies, ConRAD strictly satisfies all risk budgets, with empirical recall falling below the target by at most 0.046 across all settings. It reduces neural invocations to zero in near-complete graph regions, and achieves precision that matches or exceeds best-case static baselines that offer no guarantees and require manual threshold search.
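To make the calibration step concrete, here is a minimal sketch of how a single score threshold could be chosen with the standard Conformal Risk Control bound. It is not ConRAD's implementation: the function names, the single-threshold setting, the threshold grid, and the toy calibration data are all assumptions made for illustration, and the abstract's per-operator thresholds, quantile-space scalarization, and conformal gate are not modeled. The loss is the per-query miss rate (1 - recall), which is bounded by 1 and non-decreasing in the threshold.

```python
import numpy as np

def miss_rate(scores, is_answer, threshold):
    """Fraction of true answers scored below the threshold (1 - recall)."""
    true_scores = scores[is_answer]
    if true_scores.size == 0:
        return 0.0
    return float(np.mean(true_scores < threshold))

def calibrate_threshold(calib_queries, alpha, grid):
    """Return the largest grid threshold whose CRC-adjusted risk is <= alpha.

    calib_queries: list of (scores, is_answer) array pairs, one per query.
    alpha: risk budget, i.e. the tolerated expected miss rate.
    Returns None when no threshold on the grid meets the budget.
    """
    n = len(calib_queries)
    best = None
    for t in sorted(grid):
        risk = np.mean([miss_rate(s, y, t) for s, y in calib_queries])
        # Conformal Risk Control bound for a loss bounded by B = 1:
        # (n / (n + 1)) * empirical_risk + B / (n + 1) <= alpha
        if (n / (n + 1)) * risk + 1.0 / (n + 1) <= alpha:
            best = t
        else:
            break  # the miss rate is non-decreasing in t, so stop here
    return best

# Hypothetical calibration set: 9 identical queries with two true answers each.
scores = np.array([0.9, 0.5, 0.1])
is_answer = np.array([True, True, False])
calib = [(scores, is_answer)] * 9
grid = [0.0, 0.25, 0.5, 0.75, 1.0]

print(calibrate_threshold(calib, alpha=0.2, grid=grid))  # prints 0.5
```

Picking the largest feasible threshold keeps as few low-scoring candidates as the budget allows, which is one simple way to favor precision once the recall constraint holds. With a budget smaller than 1/(n + 1), no threshold can satisfy the bound and the sketch returns None.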

Problem

Research questions and friction points this paper is trying to address.

incomplete knowledge graphs

neural predictors

recall guarantees

multi-hop querying

risk control

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal Risk Control

Neural Databases

Recall Guarantees