SG-LegalCite: A Principle-Augmented Benchmark for Legal Citation Retrieval in Singapore Law

📅 2026-05-20

🤖 AI Summary
This study addresses a limitation of existing legal citation retrieval benchmarks: their inputs (case facts, citation context, or full judgments) leave the governing legal principle missing or only implicit, so models can retrieve precedents that are factually similar yet doctrinally irrelevant. The limitation is particularly consequential in Singapore, where only domestic precedents are binding and foreign authorities are merely persuasive. The authors propose a retrieval paradigm that ranks cited cases using queries that combine case facts with explicit legal principles. To support it, they introduce SG-LegalCite, a dataset of 100,890 case-principle pairs extracted from 8,523 Singapore Supreme Court judgments from 2000 to 2025. Experiments across 11 baselines indicate that explicit legal principles provide strong discriminative signals for citation retrieval, in line with real-world legal reasoning workflows.
📝 Abstract
Legal citation in common-law systems depends not only on factual similarity, but also on the legal principle for which a precedent is invoked. However, existing benchmarks for legal citation retrieval use case facts, citation context, or full judgments as inputs, where the governing legal principle is often missing or only implicitly expressed and entangled with broader context. As a result, models may retrieve precedents that are factually similar yet doctrinally irrelevant. This limitation is particularly consequential in Singapore, where the legal system has evolved independently: only domestic precedents are binding, while foreign authorities serve merely as persuasive references. Thus, we propose a new retrieval paradigm that ranks cited cases based on queries integrating case facts and explicit legal principles, inspired by real-world legal reasoning workflows. To support this paradigm, we introduce SG-LegalCite, a dataset of 100,890 case-principle pairs extracted from 8,523 Singapore Supreme Court judgments spanning from 2000 to 2025. Experiments across 11 baselines demonstrate the effectiveness of our principle-augmented retrieval paradigm, showing that explicit legal principles provide strong discriminative signals for legal citation retrieval.
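The idea of querying with facts plus an explicit principle can be illustrated with a minimal sketch. This is not the authors' method or any of their 11 baselines: it assumes a plain bag-of-words cosine scorer, and the function names (`build_query`, `rank`), the case identifiers, and the toy texts are all hypothetical, invented for illustration rather than drawn from SG-LegalCite.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_query(facts, principle=None):
    """Join case facts with an explicit legal principle, if one is given."""
    return facts if principle is None else f"{facts} {principle}"


def rank(query, candidates):
    """Return candidate ids sorted by descending similarity to the query."""
    q = Counter(tokenize(query))
    scores = {cid: cosine(q, Counter(tokenize(text)))
              for cid, text in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy candidates (not real judgments).
candidates = {
    # Factually close to the query, but cited for damages, not duty of care.
    "case_a": "lorry collided with cyclist at road junction damages for injury",
    # Shares fewer facts, but states the principle the query invokes.
    "case_b": "cyclist injury duty of care requires foreseeability and proximity",
}

facts = "lorry collided with cyclist at junction causing injury"
principle = "duty of care requires foreseeability and proximity"

facts_only = rank(build_query(facts), candidates)
facts_and_principle = rank(build_query(facts, principle), candidates)
print(facts_only[0], facts_and_principle[0])
```

In this toy setting, the facts-only query ranks the factually similar case first, while adding the principle moves the doctrinally relevant case to the top. A real system would presumably replace the bag-of-words scorer with a lexical or dense retriever.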
Problem

Research questions and friction points this paper is trying to address.

legal citation retrieval
legal principle
Singapore law
precedent relevance
common-law systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

legal citation retrieval
legal principle
principle-augmented retrieval
Singapore law
case-principle pairs