Hystar: Hypernetwork-driven Style-adaptive Retrieval via Dynamic SVD Modulation

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the distribution shift caused by diverse query styles (such as sketches, artistic renderings, and low-resolution images) in image retrieval and classification tasks. The authors propose Hystar, a lightweight framework that they describe as the first to combine hypernetworks with dynamic singular value decomposition (SVD) modulation. A hypernetwork generates singular-value perturbations for attention layers that adapt to each input's style, while static singular-value offsets in MLP layers improve cross-style stability. They also design StyleNCE, a contrastive loss weighted by optimal transport, to better separate hard negative samples that are semantically confusable across styles. On multi-style image retrieval and cross-style classification benchmarks, Hystar is reported to achieve state-of-the-art performance while remaining parameter-efficient and robust across visual styles.
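To make the singular-value modulation concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a weight matrix is factored once as $W = U\,\mathrm{diag}(S)\,V^\top$ and that adaptation only changes $S$. The two-layer `hypernet_delta_s` function, its `tanh` bounding, the `scale` factor, and all dimensions are hypothetical choices made for illustration; the summary does not specify the hypernetwork architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def svd_factors(W):
    # Thin SVD: W = U @ diag(S) @ Vt
    return np.linalg.svd(W, full_matrices=False)

def hypernet_delta_s(style_feat, H1, H2, scale=0.1):
    # Hypothetical two-layer hypernetwork: style feature -> bounded delta S.
    hidden = np.tanh(style_feat @ H1)
    return scale * np.tanh(hidden @ H2)

def modulated_weight(U, S, Vt, delta_s):
    # W' = U @ diag(S + delta_s) @ Vt, using broadcasting instead of diag().
    return (U * (S + delta_s)) @ Vt

# Illustrative sizes (assumptions, not taken from the paper).
d_out, d_in, d_style, d_hidden = 8, 8, 4, 16

# Attention weight: dynamic, per-input perturbation of singular values.
W_attn = rng.normal(size=(d_out, d_in))
U, S, Vt = svd_factors(W_attn)
H1 = 0.1 * rng.normal(size=(d_style, d_hidden))
H2 = 0.1 * rng.normal(size=(d_hidden, S.shape[0]))
style_feat = rng.normal(size=d_style)      # stand-in for a query style embedding
delta_s = hypernet_delta_s(style_feat, H1, H2)
W_dyn = modulated_weight(U, S, Vt, delta_s)

# MLP weight: one static offset shared by all inputs (a learned parameter
# in practice; zero here, which leaves the weight unchanged).
W_mlp = rng.normal(size=(d_out, d_in))
Um, Sm, Vtm = svd_factors(W_mlp)
static_offset = np.zeros_like(Sm)
W_static = modulated_weight(Um, Sm, Vtm, static_offset)
```

Because only the vector of singular values is modified, the number of adapted values per layer equals the rank of the weight matrix rather than its full size, which is one plausible reading of the parameter-efficiency claim.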

📝 Abstract

Query-based image retrieval (QBIR) requires retrieving relevant images given diverse and often stylistically heterogeneous queries, such as sketches, artworks, or low-resolution previews. While large-scale vision-language representation models (VLRMs) like CLIP offer strong zero-shot retrieval performance, they struggle with distribution shifts caused by unseen query styles. In this paper, we propose Hypernetwork-driven Style-adaptive Retrieval (Hystar), a lightweight framework that dynamically adapts model weights to each query's style. Hystar employs a hypernetwork to generate singular-value perturbations ($\Delta S$) for attention layers, enabling flexible per-input adaptation, while static singular-value offsets on MLP layers ensure cross-style stability. To better handle semantic confusions across styles, we design StyleNCE as part of Hystar, an optimal-transport-weighted contrastive loss that emphasizes hard cross-style negatives. Extensive experiments on multi-style retrieval and cross-style classification benchmarks demonstrate that Hystar consistently outperforms strong baselines, achieving state-of-the-art performance while being parameter-efficient and stable across styles.
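The abstract does not give the StyleNCE formula, so the following is only a sketch of one way an optimal-transport-weighted contrastive loss could look. It assumes that `q[i]` and `g[i]` form a positive pair, that the transport plan comes from entropic OT (Sinkhorn iterations) with uniform marginals on the cost `1 - cosine similarity`, and that each row of the plan, restricted to negatives, reweights the InfoNCE denominator. The function names and the values of `tau`, `eps`, and `n_iters` are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def sinkhorn(cost, eps=0.1, n_iters=50):
    # Entropic optimal transport with uniform marginals; returns the plan.
    n, m = cost.shape
    K = np.exp(-cost / eps)
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    v = np.full(m, 1.0 / m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_weighted_nce(q, g, tau=0.07, eps=0.1):
    # q[i] / g[i] are positives; g[j] with j != i are negatives for q[i].
    q, g = l2_normalize(q), l2_normalize(g)
    sim = q @ g.T
    n = sim.shape[0]
    plan = sinkhorn(1.0 - sim, eps)
    # Keep only negatives and rescale so the mean negative weight is 1;
    # low-cost (similar, hence hard) negatives tend to receive more mass.
    w = plan * (1.0 - np.eye(n))
    w = w / w.sum(axis=1, keepdims=True) * (n - 1)
    logits = sim / tau
    pos = np.exp(np.diag(logits))
    neg = (w * np.exp(logits)).sum(axis=1)
    return float(np.mean(-np.log(pos / (pos + neg))))

rng = np.random.default_rng(0)
gallery = rng.normal(size=(6, 16))                    # stand-in image embeddings
queries = gallery + 0.05 * rng.normal(size=(6, 16))   # stand-in styled queries
loss_aligned = ot_weighted_nce(queries, gallery)
loss_shuffled = ot_weighted_nce(queries, np.roll(gallery, 1, axis=0))
```

With all negative weights equal to 1, this reduces to standard InfoNCE, so the OT plan only changes how much each negative contributes to the denominator.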

Problem

Research questions and friction points this paper is trying to address.

query-based image retrieval

style heterogeneity

distribution shift

zero-shot retrieval

cross-style adaptation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypernetwork

Dynamic SVD Modulation

Style-adaptive Retrieval