🤖 AI Summary
This work addresses the challenge that model confidence varies dynamically across contexts in ensembles of large language models (LLMs). We propose LENS, a lightweight linear confidence predictor trained on layer-wise hidden states and normalized output probabilities, requiring no fine-tuning or parameter updates to the base LLMs. By modeling these neural states, LENS captures the correlation between internal representations and reliability, enabling efficient, zero-shot dynamic output weighting. The method incurs far less computational overhead than gradient-based or parameter-intensive alternatives. Empirically, LENS consistently outperforms majority voting and logit-based ensemble baselines on multiple-choice and Boolean question-answering tasks, demonstrating the effectiveness and generalizability of leveraging fine-grained internal representations for confidence estimation. Overall, LENS establishes an efficient and robust paradigm for collaborative inference across heterogeneous LLMs.
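To make the idea concrete, here is a minimal sketch of a LENS-style confidence head. The summary specifies only that the predictor is linear over layer-wise hidden states and normalized output probabilities; the mean-pooling across layers, the sigmoid output, and all names below are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class LinearConfidencePredictor(nn.Module):
    """Hypothetical LENS-style confidence head (illustrative sketch)."""

    def __init__(self, hidden_dim: int, num_choices: int):
        super().__init__()
        # A single linear layer over pooled hidden states concatenated
        # with the normalized answer probabilities.
        self.linear = nn.Linear(hidden_dim + num_choices, 1)

    def forward(self, hidden_states: torch.Tensor, probs: torch.Tensor) -> torch.Tensor:
        # hidden_states: (num_layers, hidden_dim) for one example.
        # Mean-pooling across layers is an assumed simplification;
        # the paper may use the layer-wise states differently.
        pooled = hidden_states.mean(dim=0)
        features = torch.cat([pooled, probs], dim=-1)
        # Squash to [0, 1] so the score can serve as an ensemble weight.
        return torch.sigmoid(self.linear(features))
```

One natural training signal, assumed here rather than taken from the paper, is a binary correctness label per question (1 if the model answered correctly, 0 otherwise), with the base LLM kept frozen throughout.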
📝 Abstract
Large Language Models (LLMs) have demonstrated impressive performance across various tasks, with different models excelling in distinct domains and specific abilities. Effectively combining the predictions of multiple LLMs is crucial for enhancing system robustness and performance. However, existing ensemble methods often rely on simple techniques like voting or logits ensembling, which overlook the varying confidence and reliability of models in different contexts. In this work, we propose LENS (Learning ENsemble confidence from Neural States), a novel approach that learns to estimate model confidence by analyzing internal representations. For each LLM, we train a lightweight linear confidence predictor that takes layer-wise hidden states and normalized probabilities as inputs. This allows for more nuanced weighting of model predictions based on their context-dependent reliability. Our method requires no modification of model parameters and adds negligible computation. Experimental results on multiple-choice and Boolean question-answering tasks demonstrate that LENS outperforms traditional ensemble methods by a substantial margin. Our findings suggest that internal representations provide valuable signals for determining model confidence and can be effectively leveraged for ensemble learning.
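As a usage illustration, the learned per-model confidences could weight each model's answer distribution at inference time. The scalar weights and the normalization scheme below are assumptions made for this sketch; the paper may combine predictions differently.

```python
import torch

def lens_ensemble(probs_per_model: list[torch.Tensor],
                  confidences: list[torch.Tensor]) -> torch.Tensor:
    """Confidence-weighted ensemble over candidate answers (illustrative).

    probs_per_model: one (num_choices,) normalized distribution per LLM.
    confidences: one scalar confidence score in [0, 1] per LLM.
    """
    weights = torch.stack([c.reshape(()) for c in confidences])
    weights = weights / weights.sum().clamp_min(1e-8)   # normalize weights
    stacked = torch.stack(probs_per_model)              # (num_models, num_choices)
    # Weighted average of the models' distributions; the ensemble's
    # answer is the argmax of the combined distribution.
    return (weights.unsqueeze(-1) * stacked).sum(dim=0)
```

Because each confidence score comes from a single linear layer per model, the extra inference cost is one small matrix-vector product per LLM, consistent with the negligible-overhead claim above.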