🤖 AI Summary
Problem: Multi-model databases, which support heterogeneous data models (e.g., relational, graph, hierarchical), lack a unified theoretical foundation for query processing.
Method: This work pioneers the systematic application of category theory to query modeling, establishing the first formal query-theoretic framework for multi-model data. It introduces two equally expressive query languages, categorical calculus and categorical algebra, and provides formal proofs of their equivalence. It also designs a set of provably correct transformation rules for categorical algebra, grounding query equivalence checking and optimization in categorical semantics.
Contributions/Results: The work characterizes the expressive power of the proposed languages and proves that core query decision problems are polynomial-time solvable. By moving beyond the traditional single-model isolation paradigm, it delivers the first category-theoretic query theory for multi-model databases, one that ensures both high expressiveness and formal verifiability.
📝 Abstract
Multi-model databases are designed to store, manage, and query data in multiple models, such as relational, hierarchical, and graph data, simultaneously. In this paper, we provide a theoretical basis for querying categorical databases. We propose two formal query languages, categorical calculus and categorical algebra, which extend relational calculus and relational algebra, respectively. We demonstrate the equivalence of these two query languages. We propose a series of transformation rules for categorical algebra to facilitate query optimization. Finally, we analyze the expressive power and computational complexity of the proposed query languages.