Hystar: Hypernetwork-driven Style-adaptive Retrieval via Dynamic SVD Modulation

📅 2026-05-11
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the distribution shift caused by diverse query styles—such as sketches, artistic renderings, and low-resolution images—in image retrieval and classification tasks. To tackle this challenge, the authors propose Hystar, a lightweight framework that, for the first time, integrates hypernetworks with dynamic singular value decomposition (SVD) modulation to generate input-style-adaptive perturbations for attention layers, while introducing static SVD offsets in MLP layers to enhance cross-style stability. Additionally, they design a StyleNCE loss based on optimal transport weighting to improve semantic discrimination of hard negative samples. Evaluated on multi-style image retrieval and cross-style classification benchmarks, Hystar achieves state-of-the-art performance while maintaining high parameter efficiency and robustness across diverse visual styles.
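As a rough illustration of the singular-value modulation described above, the following NumPy sketch factors a weight matrix once, then rebuilds it with perturbed singular values. The summary does not specify the hypernetwork architecture, the style encoder, or the initialization, so `TinyHyperNet`, its layer sizes, the zero-initialized output layer, and the `style` vector are all hypothetical choices made for this example and should not be read as the authors' implementation.

```python
import numpy as np

def svd_factor(W):
    """Factor W = U @ diag(S) @ Vt once; U and Vt stay frozen afterwards."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U, S, Vt

class TinyHyperNet:
    """Hypothetical two-layer MLP mapping a style embedding to delta_S."""
    def __init__(self, style_dim, rank, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(style_dim, hidden))
        # Assumption: zero-initialized output layer, so delta_S starts at 0
        # and the adapted weight initially equals the pretrained weight.
        self.W2 = np.zeros((hidden, rank))

    def __call__(self, style):
        return np.tanh(style @ self.W1) @ self.W2

def modulated_weight(U, S, Vt, delta_S):
    """Rebuild the weight as U @ diag(S + delta_S) @ Vt."""
    return (U * (S + delta_S)) @ Vt

# Attention-style layer: delta_S depends on the input's style embedding.
W_attn = np.random.default_rng(1).normal(size=(6, 4))
U, S, Vt = svd_factor(W_attn)
hyper = TinyHyperNet(style_dim=3, rank=S.shape[0])
style = np.array([0.2, -1.0, 0.5])   # stand-in for a per-query style feature
W_dynamic = modulated_weight(U, S, Vt, hyper(style))

# MLP-style layer: one learned offset shared by every input (static).
static_offset = np.zeros(S.shape[0])  # would be a trainable vector
W_static = modulated_weight(U, S, Vt, static_offset)
```

Only the singular values are adjusted in this sketch, so the number of adapted values per layer equals the rank of the weight matrix, which is one plausible reading of the parameter-efficiency claim.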
📝 Abstract
Query-based image retrieval (QBIR) requires retrieving relevant images given diverse and often stylistically heterogeneous queries, such as sketches, artworks, or low-resolution previews. While large-scale vision-language representation models (VLRMs) like CLIP offer strong zero-shot retrieval performance, they struggle with distribution shifts caused by unseen query styles. In this paper, we propose the Hypernetwork-driven Style-adaptive Retrieval (Hystar), a lightweight framework that dynamically adapts model weights to each query's style. Hystar employs a hypernetwork to generate singular-value perturbations ($\Delta S$) for attention layers, enabling flexible per-input adaptation, while static singular-value offsets on MLP layers ensure cross-style stability. To better handle semantic confusions across styles, we design StyleNCE as part of Hystar, an optimal-transport-weighted contrastive loss that emphasizes hard cross-style negatives. Extensive experiments on multi-style retrieval and cross-style classification benchmarks demonstrate that Hystar consistently outperforms strong baselines, achieving state-of-the-art performance while being parameter-efficient and stable across styles.
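The abstract does not give the StyleNCE formula, so the sketch below shows only one generic way an optimal-transport plan could reweight negatives in an InfoNCE-style loss. The function names, the cosine cost, the uniform marginals, the entropic Sinkhorn solver, and the values of `tau` and `eps` are all assumptions made for illustration; this is not the paper's definition of StyleNCE.

```python
import numpy as np

def sinkhorn(C, eps=0.1, n_iter=50):
    """Entropic OT plan between uniform marginals for cost matrix C."""
    n, m = C.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    K = np.exp(-C / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_weighted_nce(q, k, tau=0.07, eps=0.1):
    """InfoNCE variant whose negatives are reweighted by an OT plan.

    q[i] and k[i] form the positive pair; every k[j], j != i, is a negative.
    Negatives that are cheap to transport to (high similarity) tend to get
    larger weights, so hard negatives contribute more to the denominator.
    """
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    k = k / np.linalg.norm(k, axis=1, keepdims=True)
    sim = q @ k.T                           # cosine similarities
    B = sim.shape[0]
    P = sinkhorn(1.0 - sim, eps)            # cost = 1 - cosine similarity
    w = np.where(~np.eye(B, dtype=bool), P, 0.0)
    # Rescale so the negative weights in each row average to 1.
    w = w / w.sum(axis=1, keepdims=True) * (B - 1)
    logits = sim / tau
    pos = np.exp(np.diag(logits))
    neg = (w * np.exp(logits)).sum(axis=1)
    return float(np.mean(-np.log(pos / (pos + neg))))

rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 16))          # e.g. sketch-style embeddings
loss_aligned = ot_weighted_nce(queries, queries)
loss_shuffled = ot_weighted_nce(queries, np.roll(queries, 1, axis=0))
```

With all weights fixed to 1 the function reduces to standard InfoNCE, which makes the effect of the transport weighting easy to isolate in an ablation.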
Problem

Research questions and friction points this paper is trying to address.

query-based image retrieval
style heterogeneity
distribution shift
zero-shot retrieval
cross-style adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypernetwork
Dynamic SVD Modulation
Style-adaptive Retrieval
StyleNCE
Cross-style Generalization