🤖 AI Summary
Problem: Existing approaches to identifying regulatory statements in legislative texts differ in how permissively they define such statements, which complicates measuring regulatory density and strictness. Method: This study gives a specific definition of regulatory statements based on the institutional grammar tool and targets EU legislation, a corpus of approximately 180,000 legal acts published between 1952 and 2023. It develops and compares two contrasting identification approaches, one based on dependency parsing and the other on a transformer-based machine learning model. Contribution/Results: The two approaches reach accuracies of 80% (dependency parsing) and 84% (transformer), with a Krippendorff's alpha of 0.58. High accuracy paired with agreement that is not exceedingly high suggests potential for combining the strengths of both approaches.
📝 Abstract
Identifying regulatory statements in legislation is useful for developing metrics that measure the regulatory density and strictness of legislation. A computational method is valuable for scaling the identification of such statements across a growing body of EU legislation, which comprises approximately 180,000 legal acts published between 1952 and 2023. Past work on extracting these statements varies in how permissively it defines what constitutes a regulatory statement. In this work, we provide a specific definition for our purposes based on the institutional grammar tool. We develop and compare two contrasting approaches for automatically identifying such statements in EU legislation, one based on dependency parsing and the other on a transformer-based machine learning model. Both approaches performed similarly well, with accuracies of 80% and 84% respectively and a Krippendorff's alpha of 0.58. The high accuracies, together with agreement that is not exceedingly high, suggest potential for combining the strengths of both approaches.
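To make the two reported metrics concrete, the following is a minimal sketch of how accuracy against gold labels and Krippendorff's alpha between two labelers could be computed. It assumes binary labels (1 = regulatory, 0 = not regulatory), exactly two labelers, and no missing values. The function names and the toy label lists are hypothetical and are not taken from the paper, whose evaluation code and data are not shown here.

```python
from collections import Counter
from itertools import permutations


def accuracy(predicted, gold):
    """Fraction of items whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("label lists must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def krippendorff_alpha_nominal(labels_a, labels_b):
    """Krippendorff's alpha for two labelers, nominal labels, no missing values."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")

    # Coincidence matrix: with two labelers, every item contributes
    # both ordered pairs (a, b) and (b, a), each with weight 1.
    coincidences = Counter()
    for a, b in zip(labels_a, labels_b):
        coincidences[(a, b)] += 1
        coincidences[(b, a)] += 1

    # Marginal total for each label value, and the grand total n.
    totals = Counter()
    for (c, _), count in coincidences.items():
        totals[c] += count
    n = sum(totals.values())

    # Observed and expected disagreement use only pairs of unequal labels.
    observed = sum(count for (c, k), count in coincidences.items() if c != k)
    expected = sum(totals[c] * totals[k] for c, k in permutations(totals, 2))
    if expected == 0:
        raise ValueError("alpha is undefined when only one label value occurs")

    return 1.0 - (n - 1) * observed / expected


# Toy labels, invented for illustration only (not the paper's data).
gold = [1, 1, 1, 0, 0, 0, 1, 0]
dependency_path = [1, 1, 0, 0, 0, 0, 1, 1]
transformer_path = [1, 1, 1, 0, 0, 1, 1, 0]

print("dependency accuracy:", accuracy(dependency_path, gold))
print("transformer accuracy:", accuracy(transformer_path, gold))
print("alpha between paths:", krippendorff_alpha_nominal(dependency_path, transformer_path))
```

In this toy data each labeler is right on most items, yet the two disagree with each other on three of eight items, so alpha comes out much lower than either accuracy. That is the same pattern the abstract points to when it reads high accuracy with moderate agreement as room for combining the approaches.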