Decision Potential Surface: A Theoretical and Practical Approximation of LLM's Decision Boundary

📅 2025-09-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
The decision boundaries of large language models (LLMs) are inherently difficult to characterize precisely due to autoregressive generation and the exponential size of the sequence space. Method: This paper introduces the Decision Potential Surface (DPS), the first formal framework modeling an LLM’s decision boundary as the zero-level contour of the DPS, with rigorous proof of equivalence. Building on this, we propose the K-DPS algorithm, which achieves efficient, provably convergent approximation of the boundary via finite sequence sampling and confidence-aware modeling—without requiring model gradients or architectural knowledge. Our analysis leverages probabilistic reasoning and concentration inequalities. Results: Extensive experiments across diverse corpora and mainstream LLMs (e.g., Llama, Qwen) demonstrate that K-DPS attains high-accuracy boundary approximation at low computational cost, with theoretically guaranteed error bounds that decay controllably with sampling budget. This work establishes a novel paradigm for interpretability and robustness analysis of LLMs.

📝 Abstract
The decision boundary, the subspace of inputs where a machine learning model assigns equal classification probabilities to two classes, is pivotal in revealing core model properties and interpreting behaviors. While analyzing the decision boundary of large language models (LLMs) has attracted increasing attention recently, constructing it for mainstream LLMs remains computationally infeasible due to the enormous vocabulary-sequence space and the auto-regressive nature of LLMs. To address this issue, in this paper we propose the Decision Potential Surface (DPS), a new notion for analyzing LLM decision boundaries. DPS is defined on the confidences in distinguishing different sampling sequences for each input, which naturally captures the potential of the decision boundary. We prove that the zero-height isohypse of the DPS is equivalent to the decision boundary of an LLM, with enclosed regions representing decision regions. Leveraging DPS, for the first time in the literature, we propose an approximate decision boundary construction algorithm, namely $K$-DPS, which requires only $K$ finite rounds of sequence sampling to approximate an LLM's decision boundary with negligible error. We theoretically derive upper bounds on the absolute error, the expected error, and the error concentration between $K$-DPS and the ideal DPS, demonstrating that these errors can be traded off against the number of sampling rounds. Our results are empirically validated by extensive experiments across various LLMs and corpora.
Problem

Research questions and friction points this paper is trying to address.

Proposes Decision Potential Surface to approximate LLM decision boundaries
Addresses computational infeasibility of constructing LLM decision boundaries
Provides theoretical error bounds for practical decision boundary approximation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes Decision Potential Surface for boundary analysis
Uses K-finite sampling to approximate decision boundary
Theoretically bounds approximation error with sampling trade-offs
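The core sampling idea behind $K$-DPS can be illustrated with a minimal sketch: estimate the decision potential at an input as the gap between the empirical confidences of two classes, computed from $K$ sampled output sequences, so that the zero crossing of the estimate approximates the decision boundary. Note this is a hypothetical illustration under assumptions, not the paper's actual algorithm; the names `k_dps_estimate`, `sample_fn`, and the toy Gaussian "LLM" stand-in are invented for demonstration.

```python
import random
from collections import Counter

def k_dps_estimate(sample_fn, x, class_a, class_b, k=100):
    """Hypothetical sketch of the K-sampling idea: approximate the
    decision potential at input x as the difference between empirical
    class confidences over k sampled sequences. sample_fn(x) stands
    in for one autoregressive sampling run of an LLM-based classifier
    that returns a class label."""
    counts = Counter(sample_fn(x) for _ in range(k))
    conf_a = counts[class_a] / k
    conf_b = counts[class_b] / k
    # The zero level of this potential approximates the decision
    # boundary: positive values lie in class_a's decision region,
    # negative values in class_b's.
    return conf_a - conf_b

# Toy stand-in for a stochastic classifier (not a real LLM):
# labels an input by the sign of a noisy observation.
def toy_sampler(x):
    return "pos" if random.gauss(x, 1.0) > 0 else "neg"

random.seed(0)
# Deep inside the "pos" region the potential is strongly positive;
# near x = 0 it hovers around zero, i.e. near the boundary.
print(k_dps_estimate(toy_sampler, 2.0, "pos", "neg", k=500))
print(k_dps_estimate(toy_sampler, 0.0, "pos", "neg", k=500))
```

As the abstract's error bounds suggest, increasing `k` tightens the Monte Carlo estimate (the standard deviation of each confidence shrinks as $O(1/\sqrt{K})$), which is the sampling-budget trade-off the paper formalizes.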