Purging the Gray Zone: Latent-Geometric Denoising for Precise Knowledge Boundary Awareness

📅 2026-04-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models hallucinate in part because they cannot accurately delineate the boundaries of their own knowledge. Existing abstention fine-tuning approaches suffer from severe label noise near the decision boundary, which leads to either excessive abstention or frequent hallucination. This work proposes GeoDe, a framework that adds a geometric denoising step to abstention fine-tuning. GeoDe uses linear probes to construct a truth hyperplane in the latent space and treats each sample's geometric distance to that hyperplane as a confidence signal, filtering out ambiguous examples while keeping high-fidelity data for fine-tuning. By removing the label noise caused by boundary ambiguity, GeoDe significantly improves truthfulness on benchmarks including TriviaQA and Natural Questions for Llama3 and Qwen3 models, and it generalizes well to out-of-distribution settings.

📝 Abstract

Large language models (LLMs) often hallucinate because they cannot accurately perceive their own knowledge boundaries. Existing abstention fine-tuning methods typically partition datasets directly by response accuracy, which exposes models to severe label noise near the decision boundary and leads to high rates of abstention or hallucination. This paper takes a latent-space representation perspective and reveals a "gray zone" near the decision hyperplane, where ambiguity in the model's internal belief is the core performance bottleneck. Based on this insight, we propose the **GeoDe** (**Geo**metric **De**noising) framework for abstention fine-tuning. The method constructs a truth hyperplane using linear probes and performs "geometric denoising" by using geometric distance to that hyperplane as a confidence signal for abstention decisions. This filters out ambiguous boundary samples while retaining high-fidelity signals for fine-tuning. Experiments across multiple models (Llama3, Qwen3) and benchmark datasets (TriviaQA, NQ, SciQ, SimpleQA) show that GeoDe significantly enhances model truthfulness and generalizes strongly in out-of-distribution (OOD) scenarios. Code is available at https://github.com/Notbesidemoon/GeoDe.
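
The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`train_linear_probe`, `signed_distance`, `geometric_denoise`), the logistic-regression probe, the symmetric `margin` threshold, and the toy two-dimensional "latent" vectors are all assumptions made for illustration. The only element taken from the abstract is the use of distance to a probe hyperplane as a confidence signal, computed here as d(h) = (w · h + b) / ||w||.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(X, y, lr=0.1, epochs=200):
    """Fit a logistic-regression probe (w, b) by batch gradient descent.

    X: list of latent vectors; y: 1 if the model answered correctly, else 0.
    (Hypothetical setup; the paper's probe details are not given here.)
    """
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            err = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def signed_distance(x, w, b):
    """Signed Euclidean distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def geometric_denoise(X, w, b, margin):
    """Return indices of samples at least `margin` away from the hyperplane.

    Samples closer than `margin` are treated as the ambiguous "gray zone"
    and dropped. The symmetric threshold is an assumption of this sketch.
    """
    return [i for i, x in enumerate(X) if abs(signed_distance(x, w, b)) >= margin]


if __name__ == "__main__":
    # Toy latent vectors: two clear clusters plus two near-boundary points.
    X = [[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0], [0.1, 0.0], [-0.1, 0.05]]
    y = [1, 1, 0, 0, 1, 0]
    w, b = train_linear_probe(X, y)
    kept = geometric_denoise(X, w, b, margin=1.0)
    print("kept indices:", kept)
```

In this sketch only the retained indices would be passed on to abstention fine-tuning. The value of `margin` controls how wide the discarded gray zone is and would need to be tuned per model and dataset.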

Problem

Research questions and friction points this paper is trying to address.

knowledge boundary

hallucination

abstention

label noise

decision boundary

Innovation

Methods, ideas, or system contributions that make the work stand out.

geometric denoising

knowledge boundary awareness

latent space representation