Intrinsic Verification of Parsers and Formal Grammar Theory in Dependent Lambek Calculus (Extended Version)

📅 2025-04-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the problem of intrinsically verifying parsers for formal grammars. It proposes a typed approach based on Dependent Lambek Calculus, a dependent variant of the Lambek calculus that combines linear and dependent types so that grammatical structure, such as regular and context-free grammars, is encoded directly at the type level. Parsers are then written as well-typed linear terms, so by construction they yield only valid parse trees for the input string. The main contributions are: (1) a linear dependent type system applied to formal grammar modeling and parser verification; (2) a denotational semantics that interprets types as formal grammars, covering regular grammars, context-free grammars, and traces of the corresponding automata within one calculus; and (3) a prototype implementation as a shallow embedding in Agda, in which the regular-expression parsers and the example context-free grammar parsers are implemented. The approach connects formal language theory, type theory, and verified parsing.

📝 Abstract

We present Dependent Lambek Calculus, a domain-specific dependent type theory for verified parsing and formal grammar theory. In Dependent Lambek Calculus, linear types are used as a syntax for formal grammars, and parsers can be written as linear terms. The linear typing restriction provides a form of intrinsic verification that a parser yields only valid parse trees for the input string. We demonstrate the expressivity of this system by showing that the combination of inductive linear types and dependency on non-linear data can be used to encode commonly used grammar formalisms such as regular and context-free grammars as well as traces of various types of automata. Using these encodings, we define parsers for regular expressions using deterministic automata, as well as examples of verified parsers of context-free grammars. We present a denotational semantics of our type theory that interprets the types as a mathematical notion of formal grammars. Based on this denotational semantics, we have made a prototype implementation of Dependent Lambek Calculus using a shallow embedding in the Agda proof assistant. All of our example parsers have been implemented in this prototype.
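
To make the phrase "types as a mathematical notion of formal grammars" more concrete, here is a minimal Python sketch. It assumes, as an illustration only, that a grammar is modeled as a function from a string to the list of its parse trees. The names `lit`, `eps`, `tensor`, `alt`, `star`, and `yield_of` are hypothetical and are not taken from the paper or its Agda prototype. Python does not enforce linearity, so the property that every parse tree spells out exactly the input string is checked at run time here rather than guaranteed by typing.

```python
from typing import Any, Callable, List

# Assumption: a grammar maps a string to the list of its parse trees.
Grammar = Callable[[str], List[Any]]

def lit(c: str) -> Grammar:
    """Grammar with exactly one parse, for the one-character string c."""
    return lambda w: [("lit", c)] if w == c else []

def eps() -> Grammar:
    """Grammar with exactly one parse, for the empty string."""
    return lambda w: [("eps",)] if w == "" else []

def tensor(a: Grammar, b: Grammar) -> Grammar:
    """Concatenation: split w in every possible way and parse both halves."""
    def g(w: str) -> List[Any]:
        return [("pair", l, r)
                for i in range(len(w) + 1)
                for l in a(w[:i])
                for r in b(w[i:])]
    return g

def alt(a: Grammar, b: Grammar) -> Grammar:
    """Disjunction: tagged union of the parses of a and of b."""
    return lambda w: [("inl", t) for t in a(w)] + [("inr", t) for t in b(w)]

def star(a: Grammar) -> Grammar:
    """Kleene star. Simplification: each repetition must consume at least
    one character, so the enumeration terminates."""
    def g(w: str) -> List[Any]:
        trees: List[Any] = [("nil",)] if w == "" else []
        for i in range(1, len(w) + 1):
            for head in a(w[:i]):
                for tail in g(w[i:]):
                    trees.append(("cons", head, tail))
        return trees
    return g

def yield_of(tree: Any) -> str:
    """Recover the string that a parse tree spells out."""
    tag = tree[0]
    if tag == "lit":
        return tree[1]
    if tag in ("eps", "nil"):
        return ""
    if tag in ("inl", "inr"):
        return yield_of(tree[1])
    return yield_of(tree[1]) + yield_of(tree[2])  # "pair" or "cons"

# Usage: the grammar (ab)* and a check that every tree yields the input.
ab_star = star(tensor(lit("a"), lit("b")))
for tree in ab_star("abab"):
    assert yield_of(tree) == "abab"
assert ab_star("aba") == []  # "aba" is not in the language
```

In this sketch an ambiguous grammar simply returns more than one tree for the same string, which is why the model uses collections of parse trees rather than a yes/no membership test.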

Problem

Research questions and friction points this paper is trying to address.

Develops Dependent Lambek Calculus for verified parsing

Encodes grammar formalisms using linear types

Provides denotational semantics for formal grammars

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dependent Lambek Calculus for verified parsing

Linear types ensure parser validity intrinsically

Shallow Agda embedding for prototype implementation
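
As a rough illustration of a parser that returns evidence rather than a bare yes/no answer, the sketch below runs a deterministic automaton and returns its trace. It assumes a hypothetical three-state DFA for the language (ab)* over the alphabet {a, b}; `DELTA`, `run_dfa`, and `spelled` are illustrative names, not the paper's definitions. In Dependent Lambek Calculus the guarantee that the trace matches the input would come from linear typing; in Python it can only be checked after the fact.

```python
from typing import Dict, List, Set, Tuple

# A trace is the list of transitions taken: (state, character, next state).
Trace = List[Tuple[int, str, int]]

# Hypothetical DFA for (ab)*: state 0 is the start and only accepting state,
# state 1 means "just read a", state 2 is a dead state.
DELTA: Dict[Tuple[int, str], int] = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 2, (1, "b"): 0,
    (2, "a"): 2, (2, "b"): 2,
}

def run_dfa(delta: Dict[Tuple[int, str], int], start: int,
            accepting: Set[int], w: str) -> Tuple[bool, Trace]:
    """Run the DFA on w and return (accepted?, trace of transitions)."""
    state = start
    trace: Trace = []
    for c in w:
        nxt = delta[(state, c)]  # total on {a, b}; other characters raise KeyError
        trace.append((state, c, nxt))
        state = nxt
    return state in accepting, trace

def spelled(trace: Trace) -> str:
    """Concatenate the characters consumed along a trace."""
    return "".join(c for _, c, _ in trace)

# Usage: both accepting and rejecting runs produce a trace over the input.
accepted, trace = run_dfa(DELTA, 0, {0}, "abab")
assert accepted and spelled(trace) == "abab"
rejected, trace2 = run_dfa(DELTA, 0, {0}, "aba")
assert not rejected and spelled(trace2) == "aba"
```

Returning the trace in both the accepting and the rejecting case means the result always accounts for the whole input, which is the run-time analogue of the validity property described above.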
