Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval

πŸ“… 2024-03-27
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 9
✨ Influential: 2
πŸ“„ PDF
πŸ€– AI Summary
Legal case retrieval suffers from heavy reliance on expert judgment for relevance assessment, poor interpretability, and low efficiency. Method: This paper proposes a few-shot, multi-stage large language model (LLM) reasoning framework that emulates the incremental reasoning process of human legal experts. It integrates legal fact extraction, expert-aligned fine-tuning, and annotation-based knowledge distillation to transfer capabilities from large models to compact ones. Contribution/Results: We introduce the first domain-specific, multi-stage legal reasoning paradigm, balancing interpretability with high-fidelity annotation alignment. Experiments show ΞΊ > 0.82 agreement between the framework’s outputs and human expert annotations. With only minimal expert labeling, a lightweight model achieves 92% of the performance of its large-model counterpart, substantially reducing domain adaptation costs while preserving legal reasoning fidelity.
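The κ > 0.82 figure refers to Cohen's kappa, a chance-corrected agreement statistic between two annotators. As a minimal sketch (the graded labels below are toy data, not the paper's), kappa can be computed in pure Python:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two annotators' label lists."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # Observed agreement: fraction of items labeled identically.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if each annotator labeled independently
    # according to their own marginal label distribution.
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb.get(k, 0) for k in ca) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Toy graded-relevance labels (0-3) from an LLM and a human expert.
llm    = [3, 2, 0, 1, 3, 2, 1, 0, 3, 2]
expert = [3, 2, 0, 1, 3, 2, 2, 0, 3, 2]
print(round(cohens_kappa(llm, expert), 3))  # 0.863
```

Values above 0.8 are conventionally read as near-perfect agreement, which is why the reported κ > 0.82 supports the framework's reliability claim.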

πŸ“ Abstract
Determining which legal cases are relevant to a given query involves navigating lengthy texts and applying nuanced legal reasoning. Traditionally, this task has demanded significant time and domain expertise to identify key Legal Facts and reach sound juridical conclusions. In addition, existing data with legal case similarities often lack interpretability, making it difficult to understand the rationale behind relevance judgments. With the growing capabilities of large language models (LLMs), researchers have begun investigating their potential in this domain. Nonetheless, how to employ a general large language model for reliable relevance judgments in legal case retrieval remains largely unexplored. To address this gap, we propose a novel few-shot approach in which LLMs assist in generating expert-aligned, interpretable relevance judgments. The proposed approach decomposes the judgment process into several stages, mimicking the workflow of human annotators and allowing for the flexible incorporation of expert reasoning to improve the accuracy of relevance judgments. Importantly, it also ensures interpretable data labeling, providing transparency and clarity in the relevance assessment process. Through a comparison of relevance judgments made by LLMs and human experts, we empirically demonstrate that the proposed approach can yield reliable and valid relevance assessments. Furthermore, we show that with minimal expert supervision, our approach enables a large language model to acquire case analysis expertise and subsequently transfer this ability to a smaller model via annotation-based knowledge distillation.
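The annotation-based knowledge distillation mentioned above amounts to using the large model's judgments as silver labels for training a smaller model. A minimal sketch of that data-construction step, where `teacher_label` is a stand-in for the large model (the toy keyword-overlap rule and all names are illustrative assumptions, not the paper's code):

```python
def teacher_label(pair):
    """Stand-in for the large model's graded relevance judgment (0-3)."""
    # Toy rule: grade by the number of shared words between the case texts.
    a, b = (set(text.split()) for text in pair)
    return min(3, len(a & b))

def distill(pairs):
    """Build a silver-labeled training set from teacher annotations."""
    return [(pair, teacher_label(pair)) for pair in pairs]

case_pairs = [
    ("breach of contract", "breach of lease"),
    ("tax fraud appeal", "breach of contract"),
]
dataset = distill(case_pairs)
print([label for _, label in dataset])  # [2, 0]
```

A compact student model fine-tuned on such teacher-labeled pairs is how, per the summary, 92% of the large model's performance is retained at a fraction of the annotation cost.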
Problem

Research questions and friction points this paper is trying to address.

Automating legal case relevance judgments using LLMs
Improving interpretability of legal case similarity data
Enhancing accuracy via expert-aligned few-shot LLM approach
Innovation

Methods, ideas, or system contributions that make the work stand out.

Few-shot LLM approach for legal relevance judgments
Decomposed judgment process mimicking human workflow
Interpretable data labeling with expert reasoning
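The decomposed, workflow-mimicking judgment process can be sketched as a staged pipeline: extract legal facts from each case, compare them, then emit a graded label with its rationale. All stage prompts, the `ask_llm` stub, and its canned replies below are illustrative assumptions, not the paper's implementation:

```python
def ask_llm(prompt: str) -> str:
    """Placeholder for a few-shot LLM call; returns canned text here."""
    canned = {
        "facts": "Defendant breached a sales contract; damages claimed.",
        "analysis": "Both cases turn on breach of contract and remedy.",
        "judgment": "3",
    }
    for key, text in canned.items():
        if key in prompt:
            return text
    return ""

def judge_relevance(query_case: str, candidate_case: str) -> dict:
    # Stage 1: extract the key legal facts from each case text.
    q_facts = ask_llm(f"Extract the key legal facts:\n{query_case}\nfacts:")
    c_facts = ask_llm(f"Extract the key legal facts:\n{candidate_case}\nfacts:")
    # Stage 2: compare the extracted facts, mimicking expert analysis.
    analysis = ask_llm(f"Compare the cases:\n{q_facts}\n{c_facts}\nanalysis:")
    # Stage 3: emit a graded relevance label plus the reasoning trace,
    # which is what makes the labeling interpretable.
    grade = ask_llm(f"{analysis}\nGive a relevance grade 0-3.\njudgment:")
    return {"grade": int(grade), "rationale": analysis}

result = judge_relevance("Case A text...", "Case B text...")
print(result["grade"])  # 3
```

Keeping each stage's output as part of the record is what distinguishes this from a single end-to-end relevance prompt: the rationale travels with the label.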
πŸ”Ž Similar Papers
No similar papers found.