Mining Legal Arguments to Study Judicial Formalism

📅 2025-12-12
🤖 AI Summary
Problem: The prevailing narrative of pervasive judicial formalism in Central and Eastern Europe (CEE) has lacked empirical validation at scale. Method: We introduce MADON, a dataset of Czech Supreme Court decisions annotated at the paragraph level with eight argument types and holistic formalism labels, and propose a multi-stage pipeline that combines transformer models adapted to Czech legal text by continued pretraining, Llama 3.1, and interpretable feature-based machine learning, using asymmetric loss and class weighting to handle class imbalance. Contribution/Results: The best models reach 82.6% macro-F1 on argumentative paragraph detection, 77.5% macro-F1 on eight-way argument type classification, and 83.2% macro-F1 on formalistic/non-formalistic decision classification. The findings challenge the dominant claim of pervasive judicial formalism in CEE. The methodology is designed to be replicable across jurisdictions, and the pipeline, datasets, and models are open-sourced.

📝 Abstract
Courts must justify their decisions, but systematically analyzing judicial reasoning at scale remains difficult. This study refutes claims about formalistic judging in Central and Eastern Europe (CEE) by developing automated methods to detect and classify judicial reasoning in Czech Supreme Courts' decisions using state-of-the-art natural language processing methods. We create the MADON dataset of 272 decisions from two Czech Supreme Courts, in which experts annotated 9,183 paragraphs with eight argument types and assigned holistic formalism labels, for supervised training and evaluation. Using a corpus of 300k Czech court decisions, we adapt transformer LLMs to the Czech legal domain by continued pretraining and experiment with methods to address dataset imbalance, including asymmetric loss and class weighting. The best models successfully detect argumentative paragraphs (82.6% macro-F1), classify traditional types of legal argument (77.5% macro-F1), and classify decisions as formalistic/non-formalistic (83.2% macro-F1). Our three-stage pipeline combining ModernBERT, Llama 3.1, and traditional feature-based machine learning achieves promising results for decision classification while reducing computational costs and increasing explainability. Empirically, we challenge prevailing narratives about CEE formalism. This work shows that legal argument mining enables reliable judicial philosophy classification and demonstrates its potential for other important tasks in computational legal studies. Our methodology is easily replicable across jurisdictions, and our entire pipeline, datasets, guidelines, models, and source code are available at https://github.com/trusthlt/madon.
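All three headline results are reported as macro-F1, the unweighted mean of per-class F1 scores, which suits imbalanced label sets because rare classes count as much as frequent ones. The following is a minimal sketch of that metric on made-up toy labels; the function name `macro_f1` and the label strings are illustrative assumptions, not taken from the MADON code base.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, hypothetical data: 8 non-argument paragraphs, 2 argument paragraphs,
# and a degenerate classifier that always predicts the majority class.
y_true = ["none"] * 8 + ["arg"] * 2
y_pred = ["none"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

On this toy input the majority-class predictor reaches 80% accuracy but a macro-F1 of only about 0.44, because the F1 of the missed minority class is zero.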
Problem

Research questions and friction points this paper is trying to address.

Automated detection of judicial reasoning patterns in court decisions
Classification of legal arguments and judicial formalism using NLP
Empirical challenge to prevailing narratives about Central European judicial formalism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapt transformer LLMs for Czech legal domain via continued pretraining
Address dataset imbalance with asymmetric loss and class weighting
Combine ModernBERT, Llama 3.1, and traditional ML for explainable classification
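The two imbalance remedies named above can be illustrated with a small sketch. It assumes inverse-frequency class weights (the "balanced" heuristic, n_samples / (n_classes * class_count)) and the asymmetric loss formulation with separate focusing exponents for positives and negatives plus a probability margin; whether the paper uses exactly these variants is not stated in the abstract, and the function names and the values of `gamma_neg` and `margin` here are illustrative assumptions.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def asymmetric_loss(p, target, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Per-example asymmetric loss for one binary label (assumed formulation).

    p      : predicted probability of the positive class
    target : 1 for a positive example, 0 for a negative one
    Negatives are shifted down by `margin` and damped by p_m ** gamma_neg,
    so easy negatives contribute almost nothing to the loss.
    """
    if target == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
    p_m = max(p - margin, 0.0)  # probability shifting for negatives
    return -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))


# Hypothetical label distribution: 8 majority vs. 2 minority examples.
weights = balanced_class_weights(["none"] * 8 + ["arg"] * 2)
print(weights)

# An easy negative (p = 0.2) is damped far below plain cross-entropy.
easy_negative = asymmetric_loss(0.2, 0)
plain_bce = -math.log(1 - 0.2)
print(easy_negative < plain_bce)
```

Class weighting rescales each class's contribution by its rarity, while the asymmetric loss changes how much each individual negative example contributes depending on how confidently it is already rejected; the two can be combined by multiplying the per-example loss by the weight of its class.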
Tomáš Koref
Center for Critical Computational Studies, Goethe University Frankfurt, Frankfurt am Main, Germany
Lena Held
Trustworthy Human Language Technologies, Technical University of Darmstadt, Germany
Mahammad Namazov
Trustworthy Human Language Technologies, Research Center Trustworthy Data Science and Security of the University Alliance Ruhr & Ruhr University Bochum, Germany
Harun Kumru
Trustworthy Human Language Technologies, Research Center Trustworthy Data Science and Security of the University Alliance Ruhr & Ruhr University Bochum, Germany
Yassine Thlija
Trustworthy Human Language Technologies, Technical University of Darmstadt, Germany
Christoph Burchard
Center for Critical Computational Studies, Goethe University Frankfurt, Frankfurt am Main, Germany
Ivan Habernal
Ruhr University Bochum
natural language processing, privacy-preserving NLP, legal NLP, argumentation mining