AnchorOPT: Towards Optimizing Dynamic Anchors for Adaptive Prompt Learning

📅 2025-11-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing CLIP prompt learning methods rely on static text anchors whose values and positions are both fixed, which limits adaptability across tasks and training stages. To address this, the authors propose AnchorOPT, a framework that makes both anchor values and anchor positions dynamic. AnchorOPT learns the anchor tokens from task-specific data and introduces a learnable position matrix, conditioned on the training stage and task context, that governs how anchor tokens and soft tokens are arranged. It uses a lightweight two-stage training strategy: anchors are learned first, then frozen while the soft tokens and the position matrix are optimized. No additional regularization or complex architectural components are required. In the reported experiments across multiple datasets, AnchorOPT performs on par with or better than some methods that rely on extra learnable modules or regularization, despite its simpler design. As a plug-and-play module, it also yields consistent gains when integrated into existing CLIP prompt learning frameworks.

📝 Abstract

Existing prompt learning methods built upon CLIP models leverage textual tokens as anchors to guide the learnable soft tokens, which improves CLIP's generalization. However, these anchors are static in both value and position, so they cannot adapt across tasks or training stages. To address this limitation, we propose AnchorOPT, a dynamic anchor-based prompt learning framework. Specifically, AnchorOPT introduces dynamism in two key dimensions: (i) anchor values are no longer handcrafted explicit textual tokens (e.g., "shape", "color") but are learned dynamically from task-specific data; and (ii) the positional relationship between anchor and soft tokens is no longer fixed but adaptively optimized via a learnable position matrix conditioned on the training stage and task context. Training occurs in two stages: we first learn the anchor tokens, then freeze them and transfer them to the second stage, where the soft tokens and the position matrix are optimized. Extensive experiments demonstrate that using only a simple learnable anchor and position matrix achieves performance comparable to or exceeding that of some methods that incorporate additional learnable modules or regularization techniques. As a plug-and-play module, AnchorOPT integrates seamlessly into existing frameworks, yielding consistent performance gains across diverse datasets. Code is publicly available at https://github.com/zhengli97/ATPrompt.
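To make the idea of a learnable position matrix concrete, here is a minimal sketch in plain Python. It assumes one plausible reading of the abstract: the position matrix holds one row of logits per output slot, each row is normalized with a softmax, and each slot of the final prompt is a weighted mix of the anchor and soft tokens. The function names (`softmax`, `arrange_tokens`) and this softmax parameterization are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def arrange_tokens(anchor_tokens, soft_tokens, position_logits):
    """Mix anchor and soft token embeddings into prompt slots.

    ASSUMPTION: position_logits is an n x n matrix (n = total token count);
    row i scores which source token should occupy output slot i.
    """
    tokens = anchor_tokens + soft_tokens  # sources: anchors first, then soft tokens
    n = len(tokens)
    if len(position_logits) != n or any(len(r) != n for r in position_logits):
        raise ValueError("position_logits must be an n x n matrix")
    dim = len(tokens[0])
    weights = [softmax(r) for r in position_logits]
    # Each output slot is a convex combination of all source tokens.
    return [
        [sum(w[j] * tokens[j][d] for j in range(n)) for d in range(dim)]
        for w in weights
    ]


# Toy usage: one anchor token and two soft tokens with 2-dimensional embeddings.
anchors = [[1.0, 0.0]]
soft = [[0.0, 1.0], [0.5, 0.5]]
# Sharp logits approximate a hard ordering: soft[0], anchor, soft[1].
logits = [
    [0.0, 50.0, 0.0],
    [50.0, 0.0, 0.0],
    [0.0, 0.0, 50.0],
]
prompt = arrange_tokens(anchors, soft, logits)
```

With sharp logits the mixing approaches a hard permutation, so the anchor ends up between the two soft tokens. With flatter logits each slot blends several tokens, which is what would let gradient-based training move the anchor position gradually under this assumed parameterization.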

Problem

Research questions and friction points this paper is trying to address.

Dynamic anchors replace static textual tokens for adaptive prompt learning

Optimizes anchor values and positions based on task-specific data

Enhances CLIP generalization without additional modules or regularization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic anchor tokens learned from task-specific data

Adaptive positional optimization via learnable matrix

Two-stage training with frozen anchor transfer
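The two-stage schedule can be sketched as a rule for which parameter groups receive updates in each stage. The snippet below is a toy illustration in plain Python: the group names (`anchor`, `soft`, `position`), the helper names, and the plain gradient-descent update are all assumptions made for illustration, and the source does not specify the optimizer or whether anything besides the anchors is trained in the first stage.

```python
def trainable_groups(stage):
    """Return the parameter groups updated in a given stage.

    ASSUMPTION: stage 1 trains only the anchors; stage 2 keeps them frozen
    and trains the soft tokens and the position matrix.
    """
    if stage == 1:
        return {"anchor"}
    if stage == 2:
        return {"soft", "position"}
    raise ValueError("stage must be 1 or 2")


def sgd_step(params, grads, stage, lr=0.1):
    """Apply one gradient-descent step to the active groups only.

    params and grads map a group name to a flat list of floats.
    Frozen groups are copied through unchanged.
    """
    active = trainable_groups(stage)
    updated = {}
    for name, values in params.items():
        if name in active:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # frozen: carried over as-is
    return updated


# Toy usage with one scalar per group and a constant gradient of 1.0.
params = {"anchor": [1.0], "soft": [1.0], "position": [1.0]}
grads = {"anchor": [1.0], "soft": [1.0], "position": [1.0]}
after_stage1 = sgd_step(params, grads, stage=1)        # only "anchor" moves
after_stage2 = sgd_step(after_stage1, grads, stage=2)  # "anchor" stays frozen
```

In a real training loop the same effect is usually obtained by disabling gradients on the anchor parameters before the second stage begins, so the learned anchors are carried over unchanged.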
