From Legal Text to Executable Decision Models: Evaluating Structured Representations for Legal Decision Model Generation

📅 2026-04-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the longstanding challenge of automatically translating legal text into executable decision logic, a task that still depends on extensive manual coding and evaluation. The authors test whether intermediate structured representations improve LLM-based generation and, using real-world data from the Dutch Environment and Planning Act, assess how input/output (I/O) constraints and semantic role labels affect the structural similarity and functional equivalence of generated decision models. I/O constraints give the dominant gain (+37–54% structural similarity over the raw-text baseline), while semantic role labels yield only modest improvements. Generated models match the gold standard on 51–53% of test scenarios, even though they are typically smaller and simpler: LLMs drop redundant pass-through logic, which accounts for up to 45–55% of nodes. Structural similarity and outcome equivalence turn out to be complementary, since neither guarantees the other.
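To make the idea of structural similarity concrete, here is a minimal sketch of one of the simplest graph kernels, a cosine-normalized vertex-label histogram kernel. The summary does not say which kernels the authors use, so this choice is an assumption made for illustration only; the graph encoding, the node labels (`decision`, `pass_through`, `question`), and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def label_histogram(graph):
    """Count node labels; `graph` maps node id -> (label, [child ids])."""
    return Counter(label for label, _children in graph.values())


def histogram_kernel(g1, g2):
    """Unnormalized vertex-histogram kernel: dot product of label counts."""
    h1, h2 = label_histogram(g1), label_histogram(g2)
    # Counter returns 0 for labels missing from h2, so no KeyError here.
    return sum(h1[label] * h2[label] for label in h1)


def normalized_similarity(g1, g2):
    """Cosine-normalized kernel value in [0, 1]; 1.0 means identical histograms."""
    denom = sqrt(histogram_kernel(g1, g1) * histogram_kernel(g2, g2))
    return histogram_kernel(g1, g2) / denom if denom else 0.0


# Hypothetical gold model with one redundant pass-through node.
gold = {
    "d1": ("decision", ["p1"]),
    "p1": ("pass_through", ["q1"]),
    "q1": ("question", []),
}
# Hypothetical generated model that omits the pass-through node.
generated = {
    "d1": ("decision", ["q1"]),
    "q1": ("question", []),
}

similarity = normalized_similarity(gold, generated)
print(f"structural similarity: {similarity:.3f}")
```

In this toy case the generated graph scores below 1.0 purely because it lacks the pass-through node, which illustrates how a structurally simpler model can be penalized by a similarity score without behaving differently.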

📝 Abstract

Transforming legal text into executable decision logic is a longstanding challenge in legal informatics. With the rise of LLMs, this task has gained renewed interest, but it remains difficult because it still requires extensive manual coding and evaluation. We use a unique real-world dataset that pairs production-grade decision models with legal text from the Dutch Environment and Planning Act. These models power the Omgevingsloket government platform, where citizens check permit requirements for environmental activities. We study whether intermediate structured representations can improve LLM-based generation of executable decision models from legal text. We compare four input conditions: raw legal text, text enriched with semantic role labels, text enriched with input and output (I/O) constraints, and text enriched with both. We evaluate along two dimensions: structural evaluation, which measures similarity to gold decision models using graph kernels and descriptive graph statistics, and outcome evaluation, which measures functional equivalence by executing models on pre-configured test scenarios. Our findings show that I/O constraints provide the dominant improvement (+37-54% similarity over baseline), while semantic role labels show modest improvements. Outcome evaluation shows that generated models match the gold standard on 51-53% of test scenarios, even though generated models are typically smaller and simpler. We find that LLMs eliminate redundant pass-through logic, which comprises up to 45-55% of nodes. Importantly, structural similarity and outcome equivalence are complementary: structural similarity does not guarantee outcome equivalence, and vice versa. To facilitate reproducibility, we publicly release our dataset of 95 production decision models with associated legal text and all experimental code.
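The outcome evaluation described in the abstract can be sketched as follows: run a gold model and a generated model on the same test scenarios and report the share of scenarios where the outcomes agree. This is a minimal illustration, not the authors' implementation. The dictionary-based model format, the node kinds, the question names (`in_protected_area`, `height_over_5m`), and the outcome labels are all hypothetical.

```python
def evaluate(model, node_id, answers):
    """Walk a tiny decision tree until a leaf outcome is reached."""
    node = model[node_id]
    kind = node["kind"]
    if kind == "outcome":
        return node["value"]
    if kind == "pass_through":
        # A pass-through node forwards to its single child unchanged.
        return evaluate(model, node["next"], answers)
    # kind == "question": branch on a boolean answer from the scenario.
    branch = "yes" if answers[node["ask"]] else "no"
    return evaluate(model, node[branch], answers)


def match_rate(gold, generated, scenarios, root="root"):
    """Fraction of scenarios on which both models return the same outcome."""
    hits = sum(
        evaluate(gold, root, s) == evaluate(generated, root, s) for s in scenarios
    )
    return hits / len(scenarios)


# Hypothetical gold model: two questions plus one redundant pass-through node.
gold = {
    "root": {"kind": "question", "ask": "in_protected_area", "yes": "p1", "no": "q2"},
    "p1": {"kind": "pass_through", "next": "permit"},
    "q2": {"kind": "question", "ask": "height_over_5m", "yes": "permit", "no": "none"},
    "permit": {"kind": "outcome", "value": "permit_required"},
    "none": {"kind": "outcome", "value": "no_permit"},
}
# Hypothetical generated model: smaller, but it ignores the height question.
generated = {
    "root": {"kind": "question", "ask": "in_protected_area", "yes": "permit", "no": "none"},
    "permit": {"kind": "outcome", "value": "permit_required"},
    "none": {"kind": "outcome", "value": "no_permit"},
}

# All four combinations of the two boolean answers.
scenarios = [
    {"in_protected_area": a, "height_over_5m": h}
    for a in (True, False)
    for h in (True, False)
]

rate = match_rate(gold, generated, scenarios)
print(f"outcome match rate: {rate:.2f}")
```

Dropping the pass-through node costs nothing in this toy case, while dropping the second question causes a mismatch on one of the four scenarios. This is the sense in which structural and outcome evaluation measure different things.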

Problem

Research questions and friction points this paper is trying to address.

legal text

executable decision models

structured representations

legal informatics

LLM-based generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

structured representations

legal decision models

input-output constraints