Living Databases: A Unified Model for Continuous Schema Evolution, Versioning, and Transformations

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Databases evolve continuously through schema changes, version updates, and data transformations, yet existing approaches typically address these functions in isolation and lack a unified abstraction. This work proposes a model that unifies continuous schema evolution, version management, and data transformation under a single abstraction and a common set of computational primitives. The model supports provenance tracking for system-visible operations, conditional propagation of updates, and configurable alerts on change events, and it provides declarative mechanisms to control the evolution of dependent artifacts such as views and machine learning models. The authors sketch a prototype in a relational-like database system built on an adaptation of the Prolly Tree, a Merkle tree-inspired data structure with tunable parameters for varying performance requirements, and report initial experimental results.

📝 Abstract

Databases, and datasets more generally, evolve continuously through updates, transformations, versioning, schema changes, streaming operations, and other mechanisms. While prior work has noted connections among some of these areas, they have traditionally been studied in isolation, each with its own abstractions, algorithms, and system implementations. In this paper, we argue for unifying these diverse functionalities under a single abstraction and a common set of computational primitives. We present such an abstraction, powerful enough to encompass existing use cases and to support new ones. Going beyond previous approaches, our framework seamlessly integrates provenance tracking for system-visible operations, conditional propagation of updates, and configurable alerts on change events. It also offers a principled treatment of dependent objects such as views and derived artifacts like machine learning models, by providing declarative mechanisms to control their evolution. Finally, we sketch a prototype implementation in a relational-like database system based on an adaptation of the "Prolly Tree", a Merkle tree-inspired data structure with tunable parameters to meet varying performance requirements, and present some initial experimental results.
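To make the Prolly Tree idea concrete, the following is a minimal sketch of a generic content-defined Merkle structure; it is not the adaptation described in the paper. The abstract does not spell out the tunable parameters, so `target_size` (the expected number of entries per node) is an assumed stand-in for one such knob, and all function names here are hypothetical. The sketch shows the property that makes such trees attractive for versioning: node boundaries depend only on content, so a single-row edit changes only a small number of leaf nodes.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _encode(entries) -> bytes:
    # Length-prefix each key so the concatenation is unambiguous.
    return b"".join(len(k).to_bytes(4, "big") + k + d for k, d in entries)


def _chunks(entries, target_size):
    # Content-defined chunking: an entry closes the current chunk when a
    # hash of that entry alone hits a 1-in-target_size pattern.
    out, current = [], []
    for key, digest in entries:
        current.append((key, digest))
        if int.from_bytes(_h(key + digest)[:8], "big") % target_size == 0:
            out.append(current)
            current = []
    if current:
        out.append(current)
    return out


def _leaf_entries(rows):
    # Leaf entries are (key, hash of value), ordered by key.
    return [(k.encode(), _h(v.encode())) for k, v in sorted(rows.items())]


def leaf_chunk_hashes(rows, target_size=4):
    """Hashes of the leaf-level nodes (assumed helper, for inspection)."""
    return [_h(_encode(c)) for c in _chunks(_leaf_entries(rows), target_size)]


def root_hash(rows, target_size=4):
    """Root hash of the tree built over a dict of string keys and values."""
    if target_size < 2:
        raise ValueError("target_size must be at least 2")
    level = _leaf_entries(rows)
    if not level:
        return _h(b"")
    while True:
        # Each chunk becomes one parent entry: (last key in chunk, chunk hash).
        level = [(c[-1][0], _h(_encode(c))) for c in _chunks(level, target_size)]
        if len(level) == 1:
            return level[0][1]


rows = {f"user{i:03d}": f"name-{i}" for i in range(200)}
edited = dict(rows, user042="renamed")

old = set(leaf_chunk_hashes(rows))
new = set(leaf_chunk_hashes(edited))
# A single-row edit invalidates at most two of the old leaf nodes.
print(len(old - new), "of", len(old), "leaf nodes changed")
print(root_hash(rows) != root_hash(edited))
```

In this sketch a larger `target_size` yields fewer, larger nodes (a shallower tree, but more data rewritten per edit), while a smaller value does the opposite. That is one plausible reading of the performance trade-off the abstract alludes to.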

Problem

Research questions and friction points this paper is trying to address.

schema evolution

data versioning

database transformation

continuous data evolution

unified data model

Innovation

Methods, ideas, or system contributions that make the work stand out.

Living Databases

schema evolution

provenance tracking