Learning to Route LLMs with Confidence Tokens

📅 2024-10-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address unreliable outputs and poorly calibrated confidence estimates from large language models (LLMs) in high-stakes settings, this paper introduces confidence tokens: learnable, end-to-end differentiable special tokens appended to the output sequence that directly model the probability that an answer is correct, replacing heuristic verbalized confidence and post-hoc calibration. The accompanying lightweight training strategy, Self-REF, combines confidence-token embedding and decoding, lightweight parameter updates, and joint optimization of routing and answer-rejection decisions. Evaluated on a multi-task dynamic routing and answer-rejection benchmark, the method reduces confidence calibration error by 37% and improves downstream task accuracy by 12.5%, significantly outperforming existing baselines. The authors present this as the first explicit confidence-modeling framework that is highly discriminative, fully differentiable, and jointly optimized with the core generation and decision-making components.
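The core mechanic — reading a confidence score off a special token's probability rather than a verbalized statement — can be sketched as follows. This is an illustrative assumption, not the paper's actual implementation: the token names and the renormalization over exactly two confidence tokens are hypothetical.

```python
import math

def confidence_score(logits, conf_id, unconf_id):
    """Hypothetical sketch: extract a confidence score from the logits at the
    position where the model emits its confidence token.

    logits    -- list of raw scores over the vocabulary at that position
    conf_id   -- vocab index of an assumed "<|conf|>" (confident) token
    unconf_id -- vocab index of an assumed "<|unconf|>" (unconfident) token
    """
    # Numerically stable softmax over the full vocabulary.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    p_conf = exps[conf_id] / z
    p_unconf = exps[unconf_id] / z
    # Renormalize over the two confidence tokens so the score lies in (0, 1).
    return p_conf / (p_conf + p_unconf)
```

Because the score comes from token probabilities the model was explicitly trained to emit, it is differentiable end-to-end, unlike post-hoc calibration applied after decoding.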

📝 Abstract
Large language models (LLMs) have demonstrated impressive performance on several tasks and are increasingly deployed in real-world applications. However, especially in high-stakes settings, it becomes vital to know when the output of an LLM may be unreliable. Depending on whether an answer is trustworthy, a system can then choose to route the question to another expert, or otherwise fall back on a safe default behavior. In this work, we study the extent to which LLMs can reliably indicate confidence in their answers, and how this notion of confidence can translate into downstream accuracy gains. We propose Self-REF, a lightweight training strategy to teach LLMs to express confidence in whether their answers are correct in a reliable manner. Self-REF introduces confidence tokens into the LLM, from which a confidence score can be extracted. Compared to conventional approaches such as verbalizing confidence and examining token probabilities, we demonstrate empirically that confidence tokens show significant improvements in downstream routing and rejection learning tasks.
Problem

Research questions and friction points this paper is trying to address.

Knowing when an LLM's output is unreliable, especially in high-stakes settings
Eliciting confidence estimates that are well calibrated with answer correctness
Translating confidence scores into routing and rejection decisions that improve downstream accuracy
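The routing use case above reduces to a simple policy once a confidence score is available: answer locally when confident, otherwise escalate to a stronger expert or fall back to a safe default. A minimal sketch, where the threshold value and action names are assumptions for illustration:

```python
def route(confidence, threshold=0.5):
    """Hypothetical threshold policy on a confidence score in [0, 1].

    Returns "answer" to keep the local model's output, or "escalate" to
    defer to a stronger model / safe default. The threshold would be tuned
    on a validation set to trade off cost against downstream accuracy.
    """
    return "answer" if confidence >= threshold else "escalate"
```

In practice the threshold controls the routing rate: raising it sends more queries to the expert, trading cost for accuracy.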
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-REF, a lightweight training strategy for reliable self-expressed confidence
Confidence tokens injected into the LLM's output vocabulary, from which a confidence score can be extracted
Significant gains on downstream routing and rejection learning tasks over verbalized confidence and token-probability baselines