Smart Eyes for Silent Threats: VLMs and In-Context Learning for THz Imaging

📅 2025-07-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Terahertz (THz) imaging suffers from scarce annotations, low spatial resolution, and intrinsic visual ambiguity, severely limiting classification performance. To address this, we propose the first vision-language contextual learning framework tailored for THz image understanding—requiring no model fine-tuning and enabling cross-modal feature adaptation via a novel modality-aligned prompting mechanism. Our method synergistically integrates zero-shot and one-shot contextual learning, leveraging open-source vision-language models to achieve substantial gains in classification accuracy and decision interpretability under extreme data scarcity. Key contributions include: (i) the first application of contextual learning to THz image analysis; (ii) a lightweight, training-free, and inherently interpretable few-shot classification paradigm. Extensive experiments validate its effectiveness and practical advantages in resource-constrained scientific applications, particularly security screening. (138 words)

Technology Category

Application Category

📝 Abstract
Terahertz (THz) imaging enables non-invasive analysis for applications such as security screening and material classification, but effective image classification remains challenging due to limited annotations, low resolution, and visual ambiguity. We introduce In-Context Learning (ICL) with Vision-Language Models (VLMs) as a flexible, interpretable alternative that requires no fine-tuning. Using a modality-aligned prompting framework, we adapt two open-weight VLMs to the THz domain and evaluate them under zero-shot and one-shot settings. Our results show that ICL improves classification and interpretability in low-data regimes. This is the first application of ICL-enhanced VLMs to THz imaging, offering a promising direction for resource-constrained scientific domains. Code: href{https://github.com/Nicolas-Poggi/Project_THz_Classification/tree/main}{GitHub repository}.
Problem

Research questions and friction points this paper is trying to address.

Improve THz image classification with limited annotations
Enhance interpretability in low-data regimes using VLMs
Adapt VLMs to THz domain without fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

In-Context Learning with Vision-Language Models
Modality-aligned prompting for THz adaptation
Zero-shot and one-shot classification enhancement
🔎 Similar Papers
No similar papers found.