🤖 AI Summary
Existing zero-shot hashing (ZSH) methods neglect local structural relationships between representations and attributes and struggle to model continuous-valued attributes, thereby limiting cross-category knowledge transfer and generalization. To address these limitations, we propose a three-level attribute exploration framework: (1) an attribute prototype network for regressing continuous attributes; (2) multi-granularity consistency constraints—operating at point-, pair-, and class-levels—to explicitly model locally transferable structures; and (3) contrastive learning to capture attribute contextual semantics, replacing instance-agnostic optimization. Our approach is the first to jointly optimize continuous attribute regression and multi-granularity structural consistency. Extensive experiments on mainstream ZSH benchmarks demonstrate significant improvements over state-of-the-art methods, with particularly pronounced gains as the number of unseen categories increases.
📝 Abstract
Zero-shot hashing (ZSH) has shown excellent success owing to its efficiency and generalization in large-scale retrieval scenarios. While considerable success has been achieved, there still exist urgent limitations. Existing works ignore the locality relationships of representations and attributes, which have effective transferability between seeable classes and unseeable classes. Also, the continuous-value attributes are not fully harnessed. In response, we conduct a COMprehensive Attribute Exploration for ZSH, named COMAE, which depicts the relationships from seen classes to unseen ones through three meticulously designed explorations, i.e., point-wise, pair-wise and class-wise consistency constraints. By regressing attributes from the proposed attribute prototype network, COMAE learns the local features that are relevant to the visual attributes. Then COMAE utilizes contrastive learning to comprehensively depict the context of attributes, rather than instance-independent optimization. Finally, the class-wise constraint is designed to cohesively learn the hash code, image representation, and visual attributes more effectively. Experimental results on the popular ZSH datasets demonstrate that COMAE outperforms state-of-the-art hashing techniques, especially in scenarios with a larger number of unseen label classes.