🤖 AI Summary
Large language models (LLMs) perform poorly on IRAC (Issue, Rule, Application, Conclusion) reasoning over Malaysian contract law, owing to weak command of domain-specific legal terminology and shallow grounding in legal knowledge.
Method: We introduce LEGALSEMI, the first semi-structured benchmark explicitly designed for IRAC-based legal reasoning. It comprises 54 expert-annotated scenarios and a complementary Structured Knowledge Graph (SKG). The methodology builds the IRAC framework directly into the dataset design, pairing semi-structured scenario annotation with SKG-based knowledge enrichment.
Contribution/Results: Integrating the SKG into Llama-3, Qwen, GPT-4, and Claude-3 yields an average 23.7% improvement in F1 across all four IRAC subtasks. This shows that combining curated data with structured knowledge is effective for modeling multi-step legal reasoning, and establishes a foundational resource and methodology for evaluating and improving domain-adapted legal LLMs.
📝 Abstract
The effectiveness of Large Language Models (LLMs) in legal reasoning is often limited by unique legal terminology and the need for highly specialized knowledge. These limitations highlight the need for high-quality data tailored to complex legal reasoning tasks. This paper introduces LEGALSEMI, a benchmark specifically curated for legal scenario analysis. LEGALSEMI comprises 54 legal scenarios, each rigorously annotated by legal experts according to the comprehensive IRAC (Issue, Rule, Application, Conclusion) framework. In addition, LEGALSEMI is accompanied by a structured knowledge graph (SKG). A series of experiments was conducted to assess the usefulness of LEGALSEMI for IRAC analysis. The experimental results demonstrate the effectiveness of incorporating the SKG for issue identification, rule retrieval, application, and conclusion generation across four different LLMs. LEGALSEMI will be made publicly available upon acceptance of this paper.
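To make the SKG-augmented pipeline concrete, here is a minimal, hypothetical sketch of how knowledge-graph triples could be retrieved and prepended to a scenario before each IRAC subtask. The triple schema, the `retrieve_rules` matcher, and the prompt template are illustrative assumptions, not the authors' implementation; the statute references (Malaysian Contracts Act 1950, ss. 2(a) and 2(b)) are real provisions used here only as sample graph entries.

```python
# Hypothetical sketch: SKG-grounded prompting for IRAC subtasks.
# The SKG is modeled as (subject, relation, object) triples; triples whose
# subject term appears in the scenario are injected into the prompt.

SKG = [
    ("offer", "defined_in", "Contracts Act 1950 s.2(a)"),
    ("acceptance", "defined_in", "Contracts Act 1950 s.2(b)"),
    ("consideration", "required_for", "valid contract"),
]

def retrieve_rules(scenario: str, skg=SKG):
    """Return triples whose subject term appears in the scenario text."""
    text = scenario.lower()
    return [t for t in skg if t[0] in text]

def build_irac_prompt(scenario: str, stage: str) -> str:
    """Assemble a prompt for one IRAC stage, grounded in retrieved triples."""
    facts = retrieve_rules(scenario)
    knowledge = "\n".join(f"- {s} {r} {o}" for s, r, o in facts)
    return (
        f"Scenario:\n{scenario}\n\n"
        f"Relevant legal knowledge:\n{knowledge or '- (none retrieved)'}\n\n"
        f"Task: produce the {stage} step of an IRAC analysis."
    )

prompt = build_irac_prompt(
    "A made an offer to B, who posted an acceptance the next day.", "Rule"
)
```

The resulting prompt would then be sent to each evaluated LLM once per IRAC stage; a real system would replace the keyword matcher with proper entity linking over the graph.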