🤖 AI Summary
This work identifies a pervasive confidence miscalibration bias in large language models (LLMs) under multilingual settings—particularly severe for non-English languages—stemming from English-dominant training biases embedded in the final output layer. To address this, the authors first observe that late-intermediate layers exhibit superior multilingual calibration compared to the output layer. Building on this insight, they propose LACE (Language-Aware Confidence Ensemble), a training-free, language-aware layer-ensemble method that leverages inter-layer representation analysis, dynamic weight assignment, and language-adaptive integration to improve cross-lingual confidence calibration. Extensive experiments across six model families and over 100 languages demonstrate that LACE significantly improves calibration for non-English languages, reducing Expected Calibration Error (ECE) by an average of 32%. This work establishes a new paradigm for developing fair, trustworthy, and globally applicable LLMs.
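For readers unfamiliar with the metric the summary cites, Expected Calibration Error (ECE) is the standard binned gap between a model's stated confidence and its empirical accuracy. A minimal sketch of the usual equal-width-bin formulation (function name and binning details are illustrative, not taken from the paper):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - mean confidence| per bin.

    confidences: predicted confidences in [0, 1], one per prediction
    correct: 1 if the prediction was right, else 0
    """
    assert len(confidences) == len(correct)
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        # Map confidence to an equal-width bin; clamp 1.0 into the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, y))
    ece = 0.0
    for bucket in bins:
        if bucket:
            avg_conf = sum(c for c, _ in bucket) / len(bucket)
            accuracy = sum(y for _, y in bucket) / len(bucket)
            ece += (len(bucket) / n) * abs(accuracy - avg_conf)
    return ece
```

A model that answers with 95% confidence but is right only 70% of the time contributes a 0.25 gap from that bin; the paper's reported 32% reduction is relative to this kind of score.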
📝 Abstract
Confidence calibration, the alignment of a model's predicted confidence with its actual accuracy, is crucial for the reliable deployment of Large Language Models (LLMs). However, this critical property remains largely under-explored in multilingual contexts. In this work, we conduct the first large-scale, systematic study of multilingual calibration across six model families and over 100 languages, revealing that non-English languages suffer from systematically worse calibration. To diagnose this, we investigate the model's internal representations and find that the final layer, biased by English-centric training, provides a poor signal for multilingual confidence. In contrast, our layer-wise analysis uncovers a key insight: late-intermediate layers consistently offer a more reliable and better-calibrated signal. Building on this, we introduce a suite of training-free methods, including Language-Aware Confidence Ensemble (LACE), which adaptively selects an optimal ensemble of layers for each specific language. Our study highlights the hidden costs of English-centric alignment and offers a new path toward building more globally equitable and trustworthy LLMs by looking beyond the final layer.
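The abstract describes LACE only at a high level, so the following is a hypothetical sketch of the core idea it states: replace the final layer's confidence with a language-specific weighted ensemble of per-layer confidences, favouring late-intermediate layers. All names, layer indices, and weights here are illustrative assumptions, not the paper's actual implementation:

```python
def ensemble_confidence(layer_confidences, weights):
    """Weighted ensemble of per-layer confidence scores (illustrative).

    layer_confidences: {layer_index: confidence in [0, 1]} for one prediction
    weights: {layer_index: weight}, e.g. selected per input language
    """
    total = sum(weights.values())
    return sum(w * layer_confidences[i] for i, w in weights.items()) / total

# Hypothetical example: for a non-English input to a 32-layer model, weight
# late-intermediate layers over a final layer that reads as overconfident.
per_layer = {24: 0.62, 26: 0.58, 28: 0.60, 32: 0.91}
weights = {24: 0.3, 26: 0.3, 28: 0.3, 32: 0.1}
conf = ensemble_confidence(per_layer, weights)  # pulled toward mid-layer signal
```

The design choice this illustrates is the abstract's central claim: because the final layer is biased by English-centric training, a per-language re-weighting toward better-calibrated late-intermediate layers can improve calibration without any fine-tuning.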