🤖 AI Summary
This work addresses the interpretive choices that large language models (LLMs) make implicitly when formalizing legal provisions, choices whose consequences are hard to anticipate and that undermine the comparability and explainability of reasoning outcomes. The authors propose a systematic approach that combines node-level matching of formalizations with a SAT solver to enumerate the cases on which alternative formalizations of the same provision reach different conclusions. Selected divergences are then rendered into natural-language factual scenarios that a legal expert can review. The method thereby maps formalization discrepancies onto intelligible edge cases and reveals their qualitative connection to real-world legal disputes. Experiments on formalizations of ten EU provisions generated by nine frontier LLMs show that behavioral divergence is essentially uncorrelated with structural agreement, and that the verbalized divergence cases include ones mirroring genuine controversies in the legal commentary.
📝 Abstract
Formalizing legal provisions promises machine-accessible law and automated legal reasoning, and recent LLMs make it tempting to generate such formalizations directly from statutory text. However, any formalization makes implicit interpretive choices whose consequences are hard to anticipate, especially if an LLM is the author. We present a method for systematically comparing different formalizations of the same legal provision by their inferences on individual cases. Given multiple formalizations of a provision, we match them at the node level, derive a shared interface for each pair from the matching, and use a SAT solver to enumerate the edge cases on which any two formalizations disagree. Selected edge cases are then verbalized into concrete factual scenarios that a legal expert can examine and act on. We apply our method to formalizations of ten EU provisions generated by nine frontier LLMs. We find that behavioral divergence between formalizations is essentially uncorrelated with their structural agreement and that the verbalized cases reveal qualitatively distinct types of disagreement, including divergences that mirror genuine controversies in the legal commentary.
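To make the notion of an edge case on which two formalizations disagree concrete, here is a minimal sketch. It is not the authors' implementation: the provision, the atom names (`resident`, `employed`, `student`), and both formalizations are hypothetical, and brute-force enumeration of truth assignments stands in for the SAT solver, which is only workable because the assumed shared interface has three atoms.

```python
from itertools import product

# Hypothetical shared interface: the atoms both formalizations were matched on.
ATOMS = ("resident", "employed", "student")

def formalization_a(facts):
    # Hypothetical reading A: residency is required in every case.
    return facts["resident"] and (facts["employed"] or facts["student"])

def formalization_b(facts):
    # Hypothetical reading B: students qualify regardless of residency.
    return (facts["resident"] and facts["employed"]) or facts["student"]

def divergent_cases(fa, fb, atoms):
    """Return every truth assignment over `atoms` on which fa and fb disagree.

    Brute force over all 2**len(atoms) assignments; a SAT solver would
    instead search for models of (fa XOR fb) and block each one found.
    """
    cases = []
    for values in product([False, True], repeat=len(atoms)):
        facts = dict(zip(atoms, values))
        if fa(facts) != fb(facts):
            cases.append(facts)
    return cases

cases = divergent_cases(formalization_a, formalization_b, ATOMS)
for facts in cases:
    print(facts, "A:", formalization_a(facts), "B:", formalization_b(facts))
```

Each returned assignment is a candidate edge case. In this toy setup every divergence involves a non-resident student, which is the kind of concrete scenario that could then be verbalized for expert review.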