An Algebraic Foundation for Knowledge Graph Construction (Extended Version)

📅 2025-03-13
🤖 AI Summary
Existing declarative mapping languages for knowledge graph construction, such as RML, lack a formal semantic foundation, which leads to semantic ambiguity, inconsistencies between implementations, optimizations that cannot be proven correct, and no basis for analyzing properties such as expressive power. Method: The paper proposes a language-agnostic mapping algebra that captures declarative mapping definitions from heterogeneous data sources to knowledge graphs. It shows that RML can be translated into this algebra, which also yields a formal definition of the semantics of RML, and it proves several algebraic rewriting rules. Contribution/Results: The algebra makes it possible to verify the correctness of mapping plans and to optimize them with the proven equivalence-preserving rewriting rules, strengthening the theoretical foundation of mapping languages and supporting more consistent tool implementations.

📝 Abstract
Although they have existed for more than ten years, have attracted diverse implementations, and have been used successfully in a significant number of applications, declarative mapping languages for constructing knowledge graphs from heterogeneous types of data sources still lack a solid formal foundation. This makes it impossible to introduce implementation and optimization techniques that are provably correct and, in fact, has led to discrepancies between different implementations. Moreover, it precludes studying fundamental properties of different languages (e.g., expressive power). To address this gap, this paper introduces a language-agnostic algebra for capturing mapping definitions. As further contributions, we show that the popular mapping language RML can be translated into our algebra (by which we also provide a formal definition of the semantics of RML) and we prove several algebraic rewriting rules that can be used to optimize mapping plans based on our algebra.
Problem

Research questions and friction points this paper is trying to address.

Lack of formal foundation for declarative mapping languages.
Discrepancies between different implementations of the same mapping language.
Need for provably correct optimization techniques and for a way to study expressive power.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces a language-agnostic algebra for mapping definitions.
Translates RML into this algebra, which also gives RML a formal semantics.
Proves algebraic rewriting rules for optimizing mapping plans.
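
To make the idea of an algebraic rewriting rule on a mapping plan concrete, here is a minimal Python sketch. It is not the paper's algebra: the operator names (`Source`, `Extend`, `Project`), the function `drop_unused_extend`, and the rule itself are assumptions chosen for illustration, in the style of relational algebra. The rule says that an attribute computed by `Extend` and then discarded by `Project` never needs to be computed, provided the extension function is total and side-effect free.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, List

Row = Dict[str, Any]


@dataclass
class Source:
    """Leaf operator: rows already read from some data source."""
    rows: List[Row]


@dataclass
class Extend:
    """Adds attribute `attr`, computed per row by `fn`, to every row."""
    attr: str
    fn: Callable[[Row], Any]
    child: Any


@dataclass
class Project:
    """Keeps only the attributes listed in `attrs`."""
    attrs: FrozenSet[str]
    child: Any


def evaluate(plan: Any) -> List[Row]:
    """Evaluates a plan bottom-up into a list of rows."""
    if isinstance(plan, Source):
        return [dict(r) for r in plan.rows]
    if isinstance(plan, Extend):
        return [{**r, plan.attr: plan.fn(r)} for r in evaluate(plan.child)]
    if isinstance(plan, Project):
        return [{k: v for k, v in r.items() if k in plan.attrs}
                for r in evaluate(plan.child)]
    raise TypeError(f"unknown operator: {plan!r}")


def drop_unused_extend(plan: Any) -> Any:
    """Hypothetical rule: Project(P, Extend(a, f, r)) = Project(P, r) if a not in P.

    Assumes `f` never raises and has no side effects.
    """
    if (isinstance(plan, Project)
            and isinstance(plan.child, Extend)
            and plan.child.attr not in plan.attrs):
        return Project(plan.attrs, plan.child.child)
    return plan


rows = [{"id": "1", "name": "Ada"}, {"id": "2", "name": "Alan"}]
make_iri = lambda r: "http://example.org/person/" + r["id"]

# The IRI is computed and then projected away, so the rule applies.
plan = Project(frozenset({"id", "name"}), Extend("iri", make_iri, Source(rows)))
optimized = drop_unused_extend(plan)

# Here the IRI is kept, so the rule must leave the plan unchanged.
kept = Project(frozenset({"iri"}), Extend("iri", make_iri, Source(rows)))
unchanged = drop_unused_extend(kept)
```

Evaluating `plan` and `optimized` gives the same rows, which is what an equivalence-preserving rule has to guarantee. A formal algebra allows such guarantees to be proven once for all inputs rather than checked on examples.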