TTL: Test-time Textual Learning for OOD Detection with Pretrained Vision-Language Models

📅 2026-04-17
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses a limitation of existing test-time out-of-distribution (OOD) detection methods: they rely on a fixed set of external OOD labels and therefore struggle to adapt to an open, dynamically evolving semantic space. The authors propose TTL (Test-time Textual Learning), a framework that removes the need for external OOD labels by learning OOD textual semantics directly from unlabeled test streams. TTL updates learnable prompts with pseudo-labeled test samples to continuously capture emerging OOD knowledge. It adds an OOD knowledge purification strategy that keeps only reliable OOD samples, which suppresses pseudo-label noise, and an OOD Textual Knowledge Bank that stores high-quality textual features for stable score calibration across batches. Experiments on two standard benchmarks with nine OOD datasets show that TTL consistently achieves state-of-the-art performance, supporting the value of textual adaptation for robust test-time OOD detection.

📝 Abstract
Vision-language models (VLMs) such as CLIP exhibit strong out-of-distribution (OOD) detection capabilities by aligning visual and textual representations. Recent CLIP-based test-time adaptation methods further improve detection performance by incorporating external OOD labels. However, such labels are finite and fixed, whereas the real OOD semantic space is inherently open-ended, so fixed labels fail to represent the diverse and evolving OOD semantics encountered in test streams. To address this limitation, we introduce Test-time Textual Learning (TTL), a framework that dynamically learns OOD textual semantics from unlabeled test streams without relying on external OOD labels. TTL updates learnable prompts using pseudo-labeled test samples to capture emerging OOD knowledge. To suppress the noise introduced by pseudo-labels, we introduce an OOD knowledge purification strategy that selects only reliable OOD samples for adaptation. In addition, TTL maintains an OOD Textual Knowledge Bank that stores high-quality textual features, providing stable score calibration across batches. Extensive experiments on two standard benchmarks with nine OOD datasets demonstrate that TTL consistently achieves state-of-the-art performance, highlighting the value of textual adaptation for robust test-time OOD detection. Our code is available at https://github.com/figec/TTL.
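
The three components named in the abstract (pseudo-labeling, purification, and a bank of OOD textual features used for scoring) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the thresholds `ood_thresh` and `keep_frac`, the temperature, and the top-quality eviction rule are all assumptions made for illustration. The prompt-learning step is omitted, so the caller supplies the OOD textual feature vectors that a text encoder would normally produce.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def id_score(img_feats, id_text, ood_text, temp=0.01):
    """Largest ID-class probability under a joint softmax over ID and OOD texts.

    Assumed scoring rule: with an empty ood_text array this reduces to a
    maximum-softmax score over the ID classes only.
    """
    text = np.vstack([id_text, ood_text])
    probs = softmax(img_feats @ text.T / temp)
    return probs[:, : len(id_text)].max(axis=1)

def purify(scores, ood_thresh=0.5, keep_frac=0.5):
    """Pseudo-label low-score samples as OOD, then keep the most confident ones.

    Both thresholds are hypothetical; the paper's selection rule may differ.
    """
    cand = np.flatnonzero(scores < ood_thresh)       # pseudo-labeled OOD
    k = int(np.ceil(keep_frac * len(cand)))
    return cand[np.argsort(scores[cand])[:k]]        # lowest scores first

class TextualBank:
    """Fixed-capacity store of OOD textual features, ranked by a quality value."""

    def __init__(self, dim, capacity=8):
        self.feats = np.empty((0, dim))
        self.quality = np.empty(0)
        self.capacity = capacity

    def add(self, feats, quality):
        self.feats = np.vstack([self.feats, feats])
        self.quality = np.concatenate([self.quality, quality])
        top = np.argsort(-self.quality)[: self.capacity]  # keep best entries
        self.feats, self.quality = self.feats[top], self.quality[top]
```

In a test-time loop under these assumptions, each batch would be scored with `id_score(batch, id_text, bank.feats)`, filtered with `purify`, and the retained samples would drive a prompt update whose resulting textual features are pushed into the bank with `bank.add`. Scoring every batch against the same bank is what keeps scores comparable from one batch to the next.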
Problem

Research questions and friction points this paper is trying to address.

out-of-distribution detection
vision-language models
test-time adaptation
open-ended semantics
OOD labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

test-time adaptation
out-of-distribution detection
vision-language models
textual prompting
pseudo-labeling
Jinlun Ye
Sun Yat-sen University, Peng Cheng Laboratory, Key Laboratory of Machine Intelligence and Advanced Computing, MOE
Jiang Liao
China United Network Communications Corporation Limited Guangdong Branch
Runhe Lai
Sun Yat-sen University, Peng Cheng Laboratory, Key Laboratory of Machine Intelligence and Advanced Computing, MOE
Xinhua Lu
Sun Yat-sen University, Peng Cheng Laboratory, Key Laboratory of Machine Intelligence and Advanced Computing, MOE
Jiaxin Zhuang
PhD in CSE, HKUST
Computer Vision, Medical Image Analysis, Artificial Intelligence
Zhiyong Gan
China United Network Communications Corporation Limited Guangdong Branch
Ruixuan Wang
Sun Yat-sen University
Computer vision, pattern recognition, machine learning, medical image analysis