Algebraic Databases

📅 2016-02-10
🏛️ arXiv.org
📈 Citations: 32
Influential: 1
🤖 AI Summary
Category-theoretic database models typically treat a schema as a category or sketch and an instance as a set-valued functor, but they have struggled to represent concrete data such as integers or strings. This paper extends the set-valued functor model with multisorted algebraic theories (Lawvere theories), so that concrete data is incorporated in a principled way and constraints and queries can use operations on data, such as multiplication or comparison of numbers. This helps bridge the gap between traditional databases and programming languages. The paper also shows that schemas, instances, change-of-schema functors, and queries all fit into a single double-categorical structure called a proarrow equipment (a.k.a. framed bicategory).
📝 Abstract
Databases have been studied category-theoretically for decades. The database schema -- whose purpose is to arrange high-level conceptual entities -- is generally modeled as a category or sketch. The data itself, often called an instance, is generally modeled as a set-valued functor, assigning to each conceptual entity a set of examples. While mathematically elegant, these categorical models have typically struggled with representing concrete data such as integers or strings. In the present work, we propose an extension of the set-valued functor model, making use of multisorted algebraic theories (a.k.a. Lawvere theories) to incorporate concrete data in a principled way. This also allows constraints and queries to make use of operations on data, such as multiplication or comparison of numbers, helping to bridge the gap between traditional databases and programming languages. We also show how all of the components of our model -- including schemas, instances, change-of-schema functors, and queries -- fit into a single double categorical structure called a proarrow equipment (a.k.a. framed bicategory).
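To make the idea in the abstract concrete, here is a minimal Python sketch of a schema, a set-valued instance, and a constraint and query that use operations on data. It is only a toy illustration, not the paper's construction: the entity `Employee`, the arrows `manager`, `salary`, and `bonus`, the operations `times` and `leq`, and all data values are hypothetical names chosen for this example.

```python
# Hypothetical toy schema: one entity, one foreign key, and two attributes
# valued in the sort "Int" of a small (assumed) algebraic theory.
ENTITIES = {"Employee"}
FOREIGN_KEYS = {"manager": ("Employee", "Employee")}
ATTRIBUTES = {"salary": ("Employee", "Int"), "bonus": ("Employee", "Int")}

# Operations of the assumed theory, interpreted on Python ints.
OPS = {
    "times": lambda x, y: x * y,
    "leq": lambda x, y: x <= y,
}

# An instance assigns a set of rows to each entity and a function
# (here a dict) to each foreign key and attribute.
instance = {
    "Employee": {"e1", "e2"},
    "manager": {"e1": "e1", "e2": "e1"},
    "salary": {"e1": 100, "e2": 40},
    "bonus": {"e1": 10, "e2": 4},
}


def is_well_formed(inst):
    """Check that every arrow is a total function with the right source and target."""
    for name, (src, tgt) in FOREIGN_KEYS.items():
        f = inst[name]
        if set(f) != inst[src] or not set(f.values()) <= inst[tgt]:
            return False
    for name, (src, _sort) in ATTRIBUTES.items():
        f = inst[name]
        if set(f) != inst[src] or not all(isinstance(v, int) for v in f.values()):
            return False
    return True


def satisfies_bonus_constraint(inst):
    """Constraint built from theory operations: 10 * bonus <= salary for every employee."""
    return all(
        OPS["leq"](OPS["times"](10, inst["bonus"][e]), inst["salary"][e])
        for e in inst["Employee"]
    )


def earns_at_most_half_of_manager(inst):
    """Query built from theory operations: employees with 2 * salary <= manager's salary."""
    return {
        e
        for e in inst["Employee"]
        if OPS["leq"](
            OPS["times"](2, inst["salary"][e]),
            inst["salary"][inst["manager"][e]],
        )
    }
```

With the sample data, `is_well_formed(instance)` and `satisfies_bonus_constraint(instance)` both hold, and `earns_at_most_half_of_manager(instance)` selects only `e2`. The point of the sketch is that the constraint and the query are written with the theory's own operations rather than treating attribute values as opaque labels.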
Problem

Research questions and friction points this paper is trying to address.

Traditional Databases
Set-Valued Functor
Functional Gap
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-sorted Algebra
Proarrow Equipments
Database Integration