GenZSL: Generative Zero-Shot Learning Via Inductive Variational Autoencoder

📅 2025-05-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing generative zero-shot learning (ZSL) methods rely heavily on strong, expert-annotated class semantic vectors (e.g., attribute annotations) and generate visual features from scratch, which limits generation quality and generalization. GenZSL addresses this with an inductive variational autoencoder: rather than generating visual features from scratch, it induces target-class features from semantically similar seen classes, guided by weak class semantic vectors obtained from CLIP text embeddings of the class names. Two strategies keep the generated samples informative for classifier training: class diversity promotion, which increases the diversity of the class semantic vectors, and target class-guided information boosting criteria, which are used to optimize the model. On AWA2, the authors report a 24.7% performance gain over f-VAEGAN and more than 60× faster training, with experiments covering three popular ZSL benchmarks.

📝 Abstract

Remarkable progress in zero-shot learning (ZSL) has been achieved using generative models. However, existing generative ZSL methods merely generate (imagine) the visual features from scratch guided by the strong class semantic vectors annotated by experts, resulting in suboptimal generative performance and limited scene generalization. To address these and advance ZSL, we propose an inductive variational autoencoder for generative zero-shot learning, dubbed GenZSL. Mimicking human-level concept learning, GenZSL operates by inducting new class samples from similar seen classes using weak class semantic vectors derived from target class names (i.e., CLIP text embedding). To ensure the generation of informative samples for training an effective ZSL classifier, our GenZSL incorporates two key strategies. Firstly, it employs class diversity promotion to enhance the diversity of class semantic vectors. Secondly, it utilizes target class-guided information boosting criteria to optimize the model. Extensive experiments conducted on three popular benchmark datasets showcase the superiority and potential of our GenZSL with significant efficacy and efficiency over f-VAEGAN, e.g., 24.7% performance gains and more than $60\times$ faster training speed on AWA2. Codes are available at https://github.com/shiming-chen/GenZSL.
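
The idea of inducting a new class from similar seen classes can be illustrated with a minimal sketch. It is not taken from the GenZSL codebase: the function names `cosine` and `nearest_seen_classes`, the choice of cosine similarity, and the toy 3-dimensional vectors (standing in for CLIP text embeddings of class names) are all assumptions made for illustration. The sketch only shows how seen classes might be ranked by semantic similarity to a target class; the generation step itself is not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_seen_classes(target_vec, seen_vecs, k=2):
    """Return the names of the k seen classes most similar to the target.

    Hypothetical helper: ranks seen classes by cosine similarity between
    their class-name embedding and the target class-name embedding.
    """
    ranked = sorted(
        seen_vecs,
        key=lambda name: cosine(target_vec, seen_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors standing in for CLIP text embeddings (assumed values).
seen = {
    "horse": [0.9, 0.1, 0.0],
    "tiger": [0.1, 0.9, 0.1],
    "dolphin": [0.0, 0.1, 0.9],
}
zebra = [0.7, 0.6, 0.0]  # unseen target class

referents = nearest_seen_classes(zebra, seen, k=2)
print(referents)
```

With these toy values, `horse` and `tiger` are selected as the seen classes from which `zebra` samples would be inducted, while `dolphin` is ignored.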

Problem

Research questions and friction points this paper is trying to address.

Improving generative performance in zero-shot learning

Enhancing scene generalization with weak semantic vectors

Boosting classifier training via diversity and optimization strategies

Innovation

Methods, ideas, or system contributions that make the work stand out.

Inductive variational autoencoder for ZSL

Class diversity promotion strategy

Target class-guided information boosting
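
The page does not describe how class diversity promotion is carried out. As a rough illustration of why more diverse class semantic vectors can help, the sketch below uses mean-centering as a stand-in technique: it removes the component shared by all class vectors so that they become easier to tell apart. This is an assumption chosen for illustration, not the paper's procedure, and the helper names and toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def center(vectors):
    """Subtract the per-dimension mean (stand-in for diversity promotion)."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return [[v[i] - mean[i] for i in range(dim)] for v in vectors]


# Toy class vectors dominated by a shared first component (assumed values).
class_vecs = [
    [1.0, 0.2, 0.0],
    [1.0, 0.0, 0.2],
    [1.0, -0.2, -0.2],
]

before = mean_pairwise_cosine(class_vecs)
after = mean_pairwise_cosine(center(class_vecs))
print(f"mean pairwise cosine before: {before:.3f}, after: {after:.3f}")
```

In this toy setup the class vectors are nearly parallel before centering and clearly separated afterwards, which is the kind of effect a diversity-promoting step aims for.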

🔎 Similar Papers

Model Synthesis for Zero-Shot Model Attribution