🤖 AI Summary
Natural language queries over multi-model databases often produce many ambiguous intermediate logical plans because operator scope and predicate semantics are uncertain, and many of these candidates are infeasible due to type mismatches, missing bindings, or engine-specific constraint violations. To address this, the work extends packed parse forest ideas to multi-model query settings and proposes the Packed Plan Forest (PPF), a structure whose size is polynomially bounded under bounded arity and annotation vocabularies. PPF detects local inconsistencies through feasibility constraints and compactly encodes all feasible plans using annotated operators, pruning infeasible ones early. Formal analysis and experiments indicate that PPFs represent exponentially many feasible plans in polynomial space with minimal overhead, establishing a scalable foundation for compiling natural language queries into executable database operations across heterogeneous systems.
📝 Abstract
Natural language (NL) interfaces to databases broaden access to heterogeneous data but often yield many ambiguous intermediate logical plans (ILPs) due to uncertain operator scope and predicate semantics. Many candidates are infeasible because of type mismatches, missing bindings, or engine-specific constraints. We address this challenge with feasibility constraints for detecting local inconsistencies and introduce the Packed Plan Forest (PPF), a polynomially bounded structure that compactly encodes all feasible ILPs while pruning infeasible ones early. Extending packed parse forest ideas to multi-model settings, PPF supports efficient feasibility analysis through annotated operators. Formal results show polynomial size under bounded arity and annotation vocabularies, and experiments confirm that PPFs capture exponentially many ILPs with minimal overhead, establishing a scalable foundation for NL-to-DB query planning across heterogeneous systems.
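To make the packing idea concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a forest stored as a dictionary from node ids to alternative annotated operators, where each alternative records the type it produces and the type it expects from each child. The names `Alt`, `count_feasible`, and `build_chain`, and the operator and type labels, are all hypothetical. A memoized count shows how a forest with a linear number of alternatives can stand for exponentially many feasible plans, while alternatives with a local type mismatch contribute nothing.

```python
from functools import lru_cache
from typing import NamedTuple


class Alt(NamedTuple):
    """One alternative operator at a packed node (hypothetical layout)."""
    op: str                # operator label
    out_type: str          # annotation: type this operator produces
    children: tuple = ()   # pairs of (child node id, type expected from it)


def count_feasible(forest, root, want=None):
    """Count plans rooted at `root` whose annotations are locally consistent.

    Assumes the forest is acyclic. Each (node, wanted type) pair is solved
    once, so the work is polynomial in the forest size even when the number
    of plans is exponential.
    """
    @lru_cache(maxsize=None)
    def count(node, want):
        total = 0
        for alt in forest[node]:
            if want is not None and alt.out_type != want:
                continue  # local type mismatch: this alternative is pruned
            ways = 1
            for child, expected in alt.children:
                ways *= count(child, expected)
            total += ways
        return total

    return count(root, want)


def build_chain(n):
    """Chain of n ambiguous nodes: two feasible readings and one infeasible."""
    forest = {"n0": [Alt("scan", "rows")]}
    for i in range(1, n + 1):
        prev = f"n{i - 1}"
        forest[f"n{i}"] = [
            Alt("filter_a", "rows", ((prev, "rows"),)),
            Alt("filter_b", "rows", ((prev, "rows"),)),
            # Infeasible: nothing in this forest produces a "graph" input.
            Alt("graph_expand", "rows", ((prev, "graph"),)),
        ]
    return forest


forest = build_chain(10)
num_alternatives = sum(len(alts) for alts in forest.values())
num_plans = count_feasible(forest, "n10", "rows")
print(num_alternatives, num_plans)
```

In this toy chain the forest holds 31 alternatives yet encodes 1024 feasible plans, since each of the ten ambiguous nodes doubles the count and the mismatched `graph_expand` reading is never counted.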