LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation

📅 2025-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing legal case retrieval (LCR) research is evaluated on small retrieval corpora (100 to 55K cases) with a narrow range of criminal query types, and it relies on embedding-based or lexical matching methods that often yield limited representations and legally irrelevant matches. To address these limitations, the paper introduces LEGAR BENCH, the first large-scale Korean LCR benchmark, whose queries cover 411 crime types over a corpus of 1.2 million legal cases. It also proposes LegalSearchLM, a retrieval model that performs legal element reasoning over the query case and then generates content grounded in the target cases through constrained decoding. On LEGAR BENCH, LegalSearchLM outperforms baselines by 6-20%, achieving state-of-the-art performance. It also generalizes well to out-of-domain cases, outperforming naive generative models trained on in-domain data by 15%.

📝 Abstract

Legal Case Retrieval (LCR), which retrieves relevant cases from a query case, is a fundamental task for legal professionals in research and decision-making. However, existing studies on LCR face two major limitations. First, they are evaluated on relatively small-scale retrieval corpora (e.g., 100-55K cases) and use a narrow range of criminal query types, which cannot sufficiently reflect the complexity of real-world legal retrieval scenarios. Second, their reliance on embedding-based or lexical matching methods often results in limited representations and legally irrelevant matches. To address these issues, we present: (1) LEGAR BENCH, the first large-scale Korean LCR benchmark, covering 411 diverse crime types in queries over 1.2M legal cases; and (2) LegalSearchLM, a retrieval model that performs legal element reasoning over the query case and directly generates content grounded in the target cases through constrained decoding. Experimental results show that LegalSearchLM outperforms baselines by 6-20% on LEGAR BENCH, achieving state-of-the-art performance. It also demonstrates strong generalization to out-of-domain cases, outperforming naive generative models trained on in-domain data by 15%.

Problem

Research questions and friction points this paper is trying to address.

Addresses limitations in Legal Case Retrieval (LCR) evaluation scale and query diversity

Improves legal case matching by generating legally relevant elements via constrained decoding

Introduces a large-scale Korean LCR benchmark with diverse crime types

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale Korean LCR benchmark with 1.2M cases

Legal element reasoning for query case analysis

Constrained decoding for grounded content generation
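
To make the constrained decoding idea concrete, here is a minimal sketch of one common way to implement it: a prefix trie built over the corpus restricts each decoding step to tokens that continue some text actually present in a target case. This is a generic illustration, not the paper's implementation. The names `build_trie`, `constrained_greedy_decode`, the `<eos>` marker, the toy corpus, and the overlap-based `score` function (standing in for a language model's next-token scores) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-sequence marker


def build_trie(sequences: Sequence[Sequence[str]]) -> Dict:
    """Index every corpus token sequence in a nested-dict prefix trie."""
    root: Dict = {}
    for seq in sequences:
        node = root
        for tok in list(seq) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    trie: Dict,
    score: Callable[[List[str], str], float],
    max_len: int = 32,
) -> List[str]:
    """Greedy decoding where only trie-permitted tokens may be emitted."""
    node, out = trie, []
    for _ in range(max_len):
        allowed = list(node.keys())  # tokens that keep us inside the corpus
        if not allowed:
            break
        best = max(allowed, key=lambda t: score(out, t))
        if best == END:
            break
        out.append(best)
        node = node[best]
    return out


# Toy corpus: each entry stands in for tokenized content of one target case.
corpus = [
    ["theft", "of", "vehicle", "at", "night"],
    ["theft", "of", "wallet", "in", "subway"],
    ["assault", "with", "weapon"],
]

# Toy stand-in for a model: prefer tokens matching the query's legal elements.
query_elements = {"theft", "wallet", "subway"}


def score(prefix: List[str], token: str) -> float:
    return 1.0 if token in query_elements else 0.0


result = constrained_greedy_decode(build_trie(corpus), score)
print(result)
```

Because every emitted token must follow an existing trie path, the output is always a prefix of some indexed sequence, so it can be mapped back to the cases that contain it. In a real system, the scorer would be a trained language model and the trie would be built over model token IDs.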

🔎 Similar Papers

Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval