From Legal Text to Executable Decision Models: Evaluating Structured Representations for Legal Decision Model Generation

📅 2026-04-18

🤖 AI Summary
This work addresses the longstanding challenge of automatically translating legal text into executable decision logic, a task that has traditionally relied on manual encoding and evaluation. The authors test whether intermediate structured representations improve LLM-based generation and present a systematic assessment, based on real-world data from the Dutch Environment and Planning Act, of how input/output (I/O) constraints and semantic role labels affect the structural similarity and functional equivalence of generated models. I/O constraints improve structural similarity by 37-54% over the raw-text baseline, while semantic role labels yield only modest gains. Generated models match the gold standard on 51-53% of test scenarios, and they omit redundant pass-through logic that accounts for up to 45-55% of nodes. Structural similarity and functional equivalence turn out to be complementary: neither guarantees the other.
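To make the notion of redundant pass-through logic concrete, the sketch below collapses nodes that merely forward their single input. The graph encoding, the node kinds ("input", "forward", "decision"), and the node names are hypothetical choices for illustration; the paper's actual model format and its definition of a pass-through node are not given in this summary.

```python
# Toy illustration only. Assumption: a decision model is a dict mapping each
# node name to (kind, list_of_input_node_names), and a "forward" node copies
# its single input unchanged. None of this is the paper's actual format.

def collapse_pass_through(nodes):
    """Return a copy of `nodes` without 'forward' nodes, with edges rewired."""
    def resolve(name):
        # Follow a chain of forward nodes back to the first real node.
        kind, inputs = nodes[name]
        while kind == "forward":
            name = inputs[0]
            kind, inputs = nodes[name]
        return name

    collapsed = {}
    for name, (kind, inputs) in nodes.items():
        if kind == "forward":
            continue  # drop the pass-through node itself
        collapsed[name] = (kind, [resolve(i) for i in inputs])
    return collapsed


# Hypothetical four-node model in which half of the nodes only forward values.
gold = {
    "question_height": ("input", []),
    "fwd_1": ("forward", ["question_height"]),
    "fwd_2": ("forward", ["fwd_1"]),
    "permit_required": ("decision", ["fwd_2"]),
}

slim = collapse_pass_through(gold)
removed_share = 1 - len(slim) / len(gold)
print(slim)
print(f"share of nodes removed: {removed_share:.0%}")
```

In this toy case the collapsed model keeps the same inputs and decision while dropping half of its nodes, which is one way a generated model can be much smaller than the gold model without necessarily changing its outcomes.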

๐Ÿ“ Abstract
Transforming legal text into executable decision logic is a longstanding challenge in legal informatics. With the rise of LLMs, this task has gained renewed interest, but it remains challenging because it requires extensive manual coding and evaluation. We use a unique real-world dataset that pairs production-grade decision models with legal text from the Dutch Environment and Planning Act. These models power the Omgevingsloket government platform, where citizens check permit requirements for environmental activities. We study whether intermediate structured representations can improve LLM-based generation of executable decision models from legal text. We compare four input conditions: raw legal text, text enriched with semantic role labels, text enriched with input and output constraints, and text enriched with both. We evaluate along two dimensions: structural evaluation, which measures similarity to gold decision models using graph kernels and descriptive graph statistics, and outcome evaluation, which measures functional equivalence by executing models on pre-configured test scenarios. Our findings show that I/O constraints provide the dominant improvement (+37-54% similarity over baseline), while semantic role labels show modest improvements. Outcome evaluation shows that generated models match the gold standard on 51-53% of test scenarios, even though generated models are typically smaller and simpler. We find that LLMs eliminate redundant pass-through logic that accounts for up to 45-55% of nodes. Importantly, structural similarity and outcome equivalence are complementary: structural similarity does not guarantee outcome equivalence, and vice versa. To facilitate reproducibility, we publicly release our dataset of 95 production decision models with associated legal text and all experimental code.
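The outcome evaluation described above can be pictured as running a gold model and a generated model on the same scenarios and counting agreements. The following is a minimal sketch under stated assumptions: the two models are stand-in Python functions, and the scenario fields (`height_m`, `protected_area`) and outcome labels are invented for illustration rather than taken from the released dataset or code.

```python
# Minimal sketch, not the authors' evaluation harness. Assumption: a decision
# model can be treated as a function from a scenario dict to an outcome label.

def match_rate(gold_model, generated_model, scenarios):
    """Fraction of scenarios on which both models return the same outcome."""
    matches = sum(1 for s in scenarios if gold_model(s) == generated_model(s))
    return matches / len(scenarios)


def gold_model(s):
    # Hypothetical gold logic: two independent triggers for a permit.
    if s["height_m"] > 5:
        return "permit required"
    if s["protected_area"]:
        return "permit required"
    return "no permit"


def generated_model(s):
    # Hypothetical simpler model that misses the protected-area rule.
    return "permit required" if s["height_m"] > 5 else "no permit"


# Invented test scenarios covering each combination of the two conditions.
scenarios = [
    {"height_m": 3, "protected_area": False},
    {"height_m": 6, "protected_area": False},
    {"height_m": 3, "protected_area": True},
    {"height_m": 8, "protected_area": True},
]

rate = match_rate(gold_model, generated_model, scenarios)
print(f"outcome match rate: {rate:.0%}")
```

Here the simpler model agrees with the gold model on three of the four scenarios and fails only where the omitted rule decides the outcome. This is the kind of gap between structure and behavior that a purely structural comparison would not expose.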

Problem

Research questions and friction points this paper is trying to address.

legal text
executable decision models
structured representations
legal informatics
LLM-based generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

structured representations
legal decision models
input-output constraints
LLM-based code generation
functional equivalence