🤖 AI Summary
Legal texts pose high comprehension barriers for non-experts, hindering rapid extraction of core elements (e.g., entities, transactions, legal sources, statements).
Method: This paper proposes LegalViz, a framework for cross-lingual legal diagram generation built on a novel multilingual dataset of 7,010 paired legal documents and visualizations, with diagrams modeled in the Graphviz DOT language. It uses few-shot prompting and LLM fine-tuning to generate interpretable legal diagrams.
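To make the DOT-based modeling concrete, here is a minimal sketch of how a judgment's core elements could be serialized as a Graphviz DOT digraph. The node shapes, edge labels, and the `to_dot` helper are illustrative assumptions, not the dataset's actual annotation schema:

```python
# Hypothetical sketch: encode a case's entities, legal sources, and relations
# as a Graphviz DOT digraph (schema is assumed, not LegalViz's actual format).

def to_dot(entities, sources, edges):
    """Serialize entities, legal sources, and labeled relations to a DOT string."""
    lines = ["digraph case {"]
    for e in entities:
        lines.append(f'  "{e}" [shape=box];')      # legal entities as boxes
    for s in sources:
        lines.append(f'  "{s}" [shape=ellipse];')  # legal sources as ellipses
    for head, rel, tail in edges:                  # transactions/statements as labeled edges
        lines.append(f'  "{head}" -> "{tail}" [label="{rel}"];')
    lines.append("}")
    return "\n".join(lines)

dot = to_dot(
    entities=["Plaintiff", "Defendant"],
    sources=["Art. 267 TFEU"],
    edges=[
        ("Plaintiff", "sued", "Defendant"),
        ("Plaintiff", "relies on", "Art. 267 TFEU"),
    ],
)
print(dot)
```

The resulting string can be rendered with any Graphviz installation (e.g. `dot -Tpng`), which is what makes DOT a convenient target representation for text-to-diagram generation.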
Contribution/Results: The paper introduces a novel evaluation metric integrating graph structure, textual semantics, and domain-specific legal knowledge, alongside a legal-content-aware cross-lingual assessment methodology. Experiments span 23 languages; the fine-tuned models significantly outperform GPT-series models in both structural accuracy and jurisprudential consistency. LegalViz establishes a paradigm for legal visualization that is interpretable, rigorously evaluable, and multilingually compatible.
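As a rough illustration of the graph-structure side of such evaluation, the sketch below scores a predicted DOT diagram against a gold one by exact edge-set F1. This is an assumption-laden simplification: the paper's actual metric also accounts for textual similarity and legal content, neither of which is modeled here.

```python
import re

# Hedged sketch: structural component of diagram evaluation as edge-set F1.
# The edge regex assumes edges of the form "head" -> "tail" [label="rel"].
EDGE_RE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"\s*\[label="([^"]*)"\]')

def edge_set(dot):
    """Extract (head, tail, label) triples from a DOT string."""
    return set(EDGE_RE.findall(dot))

def edge_f1(pred_dot, gold_dot):
    """F1 over exactly matching labeled edges between prediction and gold."""
    pred, gold = edge_set(pred_dot), edge_set(gold_dot)
    if not pred or not gold:
        return 0.0
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = '"A" -> "B" [label="sued"]; "A" -> "Art. 5" [label="cites"];'
pred = '"A" -> "B" [label="sued"]; "A" -> "C" [label="paid"];'
print(edge_f1(pred, gold))  # 0.5: one of two gold edges recovered, one spurious
```

A real metric would likely relax exact matching (e.g. soft label similarity) so that semantically equivalent but differently worded edges still receive credit.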
📝 Abstract
Legal documents such as judgments and court orders require sophisticated legal knowledge to understand. To open this expert knowledge to non-experts, we explore the problem of visualizing legal texts as easy-to-understand diagrams and propose LegalViz, a novel dataset of 7,010 paired legal documents and visualizations in 23 languages, using the DOT graph description language of Graphviz. LegalViz distills a complicated legal text into a simple diagram that identifies at a glance the legal entities, transactions, legal sources, and statements essential to each judgment. In addition, we provide new evaluation metrics for legal diagram visualization that consider graph structures, textual similarities, and legal contents. We conducted empirical studies on few-shot and fine-tuned large language models for generating legal diagrams and evaluated them with these metrics, including legal-content-based evaluation across 23 languages. Models trained with LegalViz outperform existing models, including GPTs, confirming the effectiveness of our dataset.