SoC: Semantic Orthogonal Calibration for Test-Time Prompt Tuning

📅 2026-01-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the poor uncertainty calibration of vision-language models under test-time prompt tuning, where overconfident predictions undermine reliability. Existing full orthogonality constraints improve prototype separation, but their gradients also push semantically related classes apart, which makes the model overconfident. The authors analyze this trade-off theoretically and propose Semantic Orthogonal Calibration (SoC), a Huber-based regularizer that separates prototypes smoothly while preserving the proximity of related classes. Experiments across multiple benchmarks show that SoC consistently improves calibration over prior orthogonality-based approaches while maintaining competitive accuracy.

📝 Abstract
With the increasing adoption of vision-language models (VLMs) in critical decision-making systems such as healthcare or autonomous driving, the calibration of their uncertainty estimates becomes paramount. Yet, this dimension has been largely underexplored in the VLM test-time prompt-tuning (TPT) literature, which has predominantly focused on improving discriminative performance. Recent state-of-the-art work advocates enforcing full orthogonality over pairs of text prompt embeddings to enhance separability, and therefore calibration. Nevertheless, as we theoretically show in this work, the inherent gradients from fully orthogonal constraints will strongly push semantically related classes away, ultimately making the model overconfident. Based on our findings, we propose Semantic Orthogonal Calibration (SoC), a Huber-based regularizer that enforces smooth prototype separation while preserving semantic proximity, thereby improving calibration compared to prior orthogonality-based approaches. Across a comprehensive empirical validation, we demonstrate that SoC consistently improves calibration performance, while also maintaining competitive discriminative capabilities.
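The contrast between a full orthogonality constraint and a Huber-based one can be illustrated with a small NumPy sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes the penalty is applied to off-diagonal cosine similarities between class text embeddings, and the function names and the threshold `delta` are hypothetical. The point it shows is a general property of the Huber function: its gradient is capped at `delta`, whereas the gradient of a squared penalty grows with the similarity, so the most similar (semantically related) pairs are pushed apart hardest.

```python
import numpy as np


def off_diagonal_cosines(prototypes):
    """Cosine similarity of every ordered pair of distinct rows (class embeddings)."""
    unit = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    gram = unit @ unit.T
    return gram[~np.eye(len(gram), dtype=bool)]


def squared_penalty(prototypes):
    """Assumed stand-in for a full orthogonality constraint: 0.5 * s^2 per pair."""
    s = off_diagonal_cosines(prototypes)
    return np.mean(0.5 * s ** 2)


def huber(s, delta):
    """Standard Huber function: quadratic for |s| <= delta, linear beyond."""
    abs_s = np.abs(s)
    return np.where(abs_s <= delta, 0.5 * s ** 2, delta * (abs_s - 0.5 * delta))


def huber_penalty(prototypes, delta=0.1):
    """Assumed Huber-based variant; delta is a hypothetical threshold."""
    return np.mean(huber(off_diagonal_cosines(prototypes), delta))


def squared_grad(s):
    """d/ds of 0.5 * s^2: grows linearly with the similarity."""
    return s


def huber_grad(s, delta):
    """d/ds of the Huber function: equals s near zero, capped at +/- delta."""
    return np.clip(s, -delta, delta)


# Two nearly identical prototypes (a closely related class pair).
related = np.array([[1.0, 0.0], [1.0, 0.0]])
print(squared_penalty(related), huber_penalty(related, delta=0.1))
print(squared_grad(0.9), huber_grad(0.9, delta=0.1))
```

In this sketch a pair with cosine similarity 0.9 receives a push of 0.9 under the squared penalty but only 0.1 under the Huber penalty, while weakly similar pairs (below `delta`) are treated identically by both.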
Problem

Research questions and friction points this paper is trying to address.

calibration
vision-language models
test-time prompt tuning
orthogonality
semantic proximity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic Orthogonal Calibration
Test-Time Prompt Tuning
Vision-Language Models
Uncertainty Calibration
Huber Regularization
Leo Fillioux
MICS, CentraleSupélec, Université Paris-Saclay, France
Omprakash Chakraborty
PhD Scholar, Indian Institute of Technology (IIT), Kharagpur
Computer Vision, Machine Intelligence, GIS
Ismail Ben Ayed
Professor, ETS Montreal
computer vision, machine learning, optimization, medical image analysis
P. Cournède
MICS, CentraleSupélec, Université Paris-Saclay, France
S. Christodoulidis
MICS, CentraleSupélec, Université Paris-Saclay, France
M. Vakalopoulou
MICS, CentraleSupélec, Université Paris-Saclay, France
J. Dolz
LIVIA, ILLS, ÉTS Montréal, Canada