AI Summary
This work addresses the semantic gap between users' colloquial legal queries and formal legal documents by proposing a generative reasoning-based approach for legal case retrieval. The method reformulates retrieval as a joint generation process of charges and legal elements, explicitly modeling legal reasoning logic and integrating a multi-view evidence fusion mechanism to enable precise ranking. It introduces a logically consistent generation strategy and an interpretable multi-stage retrieval framework, achieving substantial improvements over strong baselines such as SAILER and KELLER on the LeCaRD and LeCaRDv2 benchmarks. Notably, the model maintains superior performance even when trained on only 10% of the available data, demonstrating enhanced data efficiency and retrieval relevance.
Abstract
The semantic gap between colloquial user queries and professional legal documents presents a fundamental challenge in Legal Case Retrieval (LCR). Existing dense retrieval methods typically treat LCR as a black-box semantic matching process, neglecting the explicit juridical logic that underpins legal relevance. To address this, we propose GLIER (Generative Legal Inference and Evidence Ranking), a framework that reformulates retrieval as an inference process over latent legal variables. GLIER decomposes the task into two interpretability-driven stages. First, a Joint Generative Inference module translates raw queries into latent legal indicators, including charges and legal elements, using a unified sequence-to-sequence strategy that jointly generates charges and elements to enforce logical consistency. Second, a Multi-View Evidence Fusion mechanism aggregates generative confidence with structural and lexical signals for precise ranking. Extensive experiments on LeCaRD and LeCaRDv2 demonstrate that GLIER outperforms strong baselines such as SAILER and KELLER. Notably, GLIER exhibits strong data efficiency, maintaining robust performance even when trained with only 10% of the data.
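The second stage described above, which aggregates generative confidence with structural and lexical signals, can be illustrated with a minimal Python sketch. The abstract does not specify how the signals are combined, so everything here is an assumption made for illustration: the weighted linear sum, the weight values, the `Candidate` fields, and the `fuse` and `rank` helpers are hypothetical and are not GLIER's actual fusion mechanism.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate case with three per-view scores, each assumed to lie in [0, 1]."""
    case_id: str
    gen_conf: float    # hypothetical: confidence of the generated charges/elements for this case
    structural: float  # hypothetical: agreement between predicted and case-level legal elements
    lexical: float     # hypothetical: normalized lexical match score (e.g. a rescaled BM25 score)


def fuse(c, weights=(0.5, 0.3, 0.2)):
    """Combine the three views with a weighted sum (an assumed fusion rule)."""
    w_gen, w_struct, w_lex = weights
    return w_gen * c.gen_conf + w_struct * c.structural + w_lex * c.lexical


def rank(candidates, weights=(0.5, 0.3, 0.2)):
    """Return candidates sorted by fused score, highest first."""
    return sorted(candidates, key=lambda c: fuse(c, weights), reverse=True)


if __name__ == "__main__":
    pool = [
        Candidate("a", gen_conf=0.9, structural=0.2, lexical=0.1),
        Candidate("b", gen_conf=0.4, structural=0.9, lexical=0.8),
        Candidate("c", gen_conf=0.1, structural=0.1, lexical=0.1),
    ]
    for cand in rank(pool):
        print(cand.case_id, round(fuse(cand), 3))
```

In this toy pool, case "b" ranks above case "a" even though "a" has the higher generative confidence, because the structural and lexical views both favor "b". This is the kind of behavior a multi-view scheme is meant to allow; in practice the weights would be tuned or learned rather than fixed by hand.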