Towards Reliable and Holistic Visual In-Context Learning Prompt Selection

📅 2025-09-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing visual in-context learning (VICL) methods rely on the similarity-priority assumption for example selection, which lacks theoretical grounding; approaches like Partial2Global use random sampling to construct global rankings, often yielding incomplete coverage and redundant comparisons that degrade ranking quality. This paper proposes RH-Partial2Global, which reformulates example selection as a global ranking problem rather than relying implicitly on visual similarity. It introduces jackknife conformal prediction to construct reliable candidate sets and a covering design-based sampling scheme to achieve uniform and complete coverage of pairwise preferences, significantly enhancing both the robustness and completeness of the resulting ranking. Extensive experiments across multiple vision tasks demonstrate that RH-Partial2Global consistently outperforms Partial2Global, validating its superior performance and stability.

📝 Abstract
Visual In-Context Learning (VICL) has emerged as a prominent approach for adapting visual foundation models to novel tasks, by effectively exploiting contextual information embedded in in-context examples, which can be formulated as a global ranking problem of potential candidates. Current VICL methods, such as Partial2Global and VPR, are grounded in the similarity-priority assumption that images more visually similar to a query image serve as better in-context examples. This foundational assumption, while intuitive, lacks sufficient justification for its efficacy in selecting optimal in-context examples. Furthermore, Partial2Global constructs its global ranking from a series of randomly sampled pairwise preference predictions. Such a reliance on random sampling can lead to incomplete coverage and redundant sampling of comparisons, thus further adversely impacting the final global ranking. To address these issues, this paper introduces an enhanced variant of Partial2Global designed for reliable and holistic selection of in-context examples in VICL. Our proposed method, dubbed RH-Partial2Global, leverages a jackknife conformal prediction-guided strategy to construct reliable alternative sets and a covering design-based sampling approach to ensure comprehensive and uniform coverage of pairwise preferences. Extensive experiments demonstrate that RH-Partial2Global achieves excellent performance and outperforms Partial2Global across diverse visual tasks.
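The abstract frames example selection as aggregating many pairwise preference predictions into one global ranking. As a rough illustration of that idea (a Copeland-style win count, not the paper's actual aggregation algorithm), pairwise preferences can be collapsed into a ranking like this:

```python
from collections import Counter

def aggregate_ranking(pairwise_prefs, n_candidates):
    """Turn pairwise preferences (winner, loser) into a global ranking
    by counting wins per candidate (Copeland-style scoring)."""
    wins = Counter()
    for winner, loser in pairwise_prefs:
        wins[winner] += 1
    # rank every candidate, including ones that never won, by win count
    return sorted(range(n_candidates), key=lambda c: wins[c], reverse=True)

# example: 4 candidates whose preferences imply 2 > 0 > 1 > 3
prefs = [(2, 0), (2, 1), (2, 3), (0, 1), (0, 3), (1, 3)]
print(aggregate_ranking(prefs, 4))  # [2, 0, 1, 3]
```

The fragility the abstract points at is visible even here: if random sampling skips some pairs, the win counts (and hence the ranking) are computed from incomplete evidence.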
Problem

Research questions and friction points this paper is trying to address.

Evaluating the similarity-priority assumption in visual in-context learning
Addressing incomplete coverage in pairwise preference sampling methods
Improving reliability and comprehensiveness of in-context example selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Jackknife conformal prediction constructs reliable alternative sets
Covering design sampling ensures comprehensive pairwise preference coverage
Enhanced Partial2Global method achieves holistic in-context example selection
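The two components above can be sketched in simplified form. Below, leave-one-out (jackknife) residuals give a conformal interval around a candidate's predicted score, and a round-robin schedule (one simple covering construction) enumerates every pairwise comparison exactly once with uniform per-round coverage. The 1-D linear score model, function names, and interval construction are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def jackknife_interval(X, y, x_new, alpha=0.1):
    """Jackknife+-style conformal interval for a 1-D linear score model.
    Leave-one-out fits yield predictions and held-out residuals; their
    quantiles bound the score of a new candidate (illustrative only)."""
    n = len(X)
    lo, hi = [], []
    for i in range(n):
        mask = np.arange(n) != i
        coef = np.polyfit(X[mask], y[mask], 1)    # leave-one-out linear fit
        r_i = abs(y[i] - np.polyval(coef, X[i]))  # held-out residual
        pred = np.polyval(coef, x_new)
        lo.append(pred - r_i)
        hi.append(pred + r_i)
    return np.quantile(lo, alpha), np.quantile(hi, 1 - alpha)

def round_robin_pairs(n):
    """Circle-method round robin: every pair of n items is compared exactly
    once, and each round uses each item at most once (uniform coverage)."""
    items = list(range(n)) + ([None] if n % 2 else [])  # pad odd n with a bye
    m = len(items)
    rounds = []
    for _ in range(m - 1):
        rounds.append([(items[i], items[m - 1 - i]) for i in range(m // 2)
                       if items[i] is not None and items[m - 1 - i] is not None])
        items = [items[0], items[-1]] + items[1:-1]  # rotate, fixing items[0]
    return rounds
```

A candidate might then be kept in the reliable set when its interval's lower bound clears a threshold, while `round_robin_pairs` replaces random pair sampling so no comparison is missed or duplicated.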