Place Matters: Comparing LLM Hallucination Rates for Place-Based Legal Queries

📅 2025-11-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates regional disparities in the hallucination rates of large language models (LLMs) answering legal questions across three jurisdictions (Los Angeles, London, and Sydney), in order to assess how geography affects the reliability of AI-powered legal information services. Method: We propose a cross-jurisdictional evaluation framework grounded in comparative functionalism, construct a test dataset derived from real-world legal queries posted on Reddit, and generate jurisdiction-specific statutory summaries using closed-source LLMs. Hallucination rates and model uncertainty are quantified via expert annotation and multi-round response consistency analysis. Contribution/Results: We find statistically significant regional variation in legal hallucination rates (p < 0.01), strongly negatively correlated with modal response frequency (r = −0.82), indicating a systematic geographic inequity in LLM legal knowledge. Crucially, this work introduces response consistency as a novel, reproducible metric of jurisdictional uncertainty, establishing a methodological foundation for fairness-aware evaluation of LLMs in legal applications.

📝 Abstract
How do we make a meaningful comparison of a large language model's knowledge of the law in one place compared to another? Quantifying these differences is critical to understanding if the quality of the legal information obtained by users of LLM-based chatbots varies depending on their location. However, obtaining meaningful comparative metrics is challenging because legal institutions in different places are not themselves easily comparable. In this work we propose a methodology to obtain place-to-place metrics based on the comparative law concept of functionalism. We construct a dataset of factual scenarios drawn from Reddit posts by users seeking legal advice for family, housing, employment, crime and traffic issues. We use these to elicit a summary of a law from the LLM relevant to each scenario in Los Angeles, London and Sydney. These summaries, typically of a legislative provision, are manually evaluated for hallucinations. We show that the rate of hallucination of legal information by leading closed-source LLMs is significantly associated with place. This suggests that the quality of legal solutions provided by these models is not evenly distributed across geography. Additionally, we show a strong negative correlation between hallucination rate and the frequency of the majority response when the LLM is sampled multiple times, suggesting a measure of uncertainty of model predictions of legal facts.
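The abstract's uncertainty measure, the frequency of the majority (modal) response when the LLM is sampled multiple times on the same query, can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the function names, the toy statute answers, and the 0/1 hallucination labels are all hypothetical.

```python
from collections import Counter

def modal_response_frequency(responses):
    """Fraction of sampled responses matching the most common answer.

    `responses` is a list of normalized LLM answers to one legal query
    (e.g. the statute name returned over several sampling rounds).
    """
    top_count = Counter(responses).most_common(1)[0][1]
    return top_count / len(responses)

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient (no external dependencies)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Toy illustration (hypothetical data): per-query modal frequencies
# vs. expert-annotated hallucination labels (1 = hallucinated).
samples = [
    ["Fair Work Act", "Fair Work Act", "Fair Work Act"],    # consistent
    ["Housing Act", "Rent Act", "Housing Act"],             # mixed
    ["Penal Code 459", "Penal Code 488", "Penal Code 211"], # inconsistent
]
freqs = [modal_response_frequency(s) for s in samples]
labels = [0, 0, 1]
print(freqs)
print(pearson_r(freqs, labels))  # negative in this toy example
```

A lower modal frequency means the model gives scattered answers across sampling rounds; the paper's finding is that such inconsistency co-occurs with hallucination, which is why the correlation is strongly negative.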
Problem

Research questions and friction points this paper is trying to address.

Comparing LLM hallucination rates on legal queries across geographic locations
Developing methodology to quantify legal information quality variations by place
Analyzing correlation between hallucination rates and model uncertainty in legal predictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Functional comparative law methodology for legal queries
Reddit-derived scenario dataset across multiple jurisdictions
Analysis of the correlation between hallucination rate and geographic location
Damian Curran
School of Computing and Information Systems, The University of Melbourne, Australia
Vanessa Sporne
The Centre for Artificial Intelligence and Digital Ethics
Lea Frermann
Computing and Information Systems, The University of Melbourne
Computational linguistics, computational cognitive modeling, NLP for narratives, bias and fairness
Jeannie Paterson
The Centre for Artificial Intelligence and Digital Ethics