Incentivizing Generative Zero-Shot Learning via Outcome-Reward Reinforcement Learning with Visual Cues

πŸ“… 2026-03-22
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses a key limitation of generative zero-shot learning: existing methods often produce task-agnostic visual features that struggle to discriminate between semantically similar yet visually distinct categories, constraining performance. To overcome this, the authors propose the RLVC framework, which integrates outcome-reward-driven reinforcement learning with class-wise visual cues to guide the generator's self-evolution toward synthesizing task-relevant features that align closely with visual prototypes. A cold-start strategy is also introduced to stabilize training dynamics. Evaluated on three standard zero-shot learning benchmarks, the proposed method achieves state-of-the-art results, with an average improvement of 4.7% over prior approaches.


πŸ“ Abstract
Recent advances in zero-shot learning (ZSL) have demonstrated the potential of generative models. Typically, generative ZSL synthesizes visual features conditioned on semantic prototypes to model the data distribution of unseen classes, then trains a classifier on the synthesized data. However, the synthesized features often remain task-agnostic, leading to degraded performance. Moreover, inferring a faithful distribution from semantic prototypes alone is insufficient for classes that are semantically similar but visually distinct. To address these issues and advance ZSL, we propose RLVC, an outcome-reward reinforcement learning (RL) framework with visual cues for generative ZSL. At its core, RL empowers the generative model to self-evolve, implicitly enhancing its generation capability. In particular, RLVC updates the generative model using an outcome-based reward, encouraging the synthesis of task-relevant features. Furthermore, we introduce class-wise visual cues that (i) align synthesized features with visual prototypes and (ii) stabilize the RL training updates. For the training process, we present a novel cold-start strategy. Comprehensive experiments and analyses on three prevalent ZSL benchmarks demonstrate that RLVC achieves state-of-the-art results with an average gain of 4.7%.
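The training loop the abstract describes can be caricatured in a few lines. This is a deliberately toy sketch, not the paper's method: the generator is a plain linear map, the outcome reward is nearest-prototype classification on the synthesized feature, and all names (`sem`, `vis`, `synthesize`), shapes, and the reward-scaled update rule are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup -- every shape here is an assumption, not the paper's design:
# 3 classes, 4-dim semantic prototypes, 5-dim class-wise visual
# prototypes (standing in for the "visual cues").
sem = rng.normal(size=(3, 4))   # semantic prototypes
vis = rng.normal(size=(3, 5))   # class-wise visual prototypes

W = rng.normal(scale=0.1, size=(4, 5))  # linear stand-in for the generator
lr = 0.05

def synthesize(c):
    # Stochastic generation: semantic prototype -> noisy visual feature.
    return sem[c] @ W + 0.1 * rng.normal(size=5)

d_init = max(np.linalg.norm(sem[c] @ W - vis[c]) for c in range(3))

for step in range(500):
    c = int(rng.integers(3))
    f = synthesize(c)
    # Outcome reward: 1 if the feature lands nearest its own class's
    # visual prototype (a proxy for downstream classifier success).
    reward = 1.0 if np.argmin(((vis - f) ** 2).sum(axis=1)) == c else 0.0
    # Visual-cue alignment gradient (MSE toward the class prototype),
    # scaled up when the outcome reward fires; the 0.5 baseline keeps a
    # weak alignment signal flowing even for zero-reward samples.
    W -= lr * (reward + 0.5) * np.outer(sem[c], f - vis[c])

d_final = max(np.linalg.norm(sem[c] @ W - vis[c]) for c in range(3))
print(f"max prototype distance: {d_init:.2f} -> {d_final:.2f}")
```

The point of the sketch is only the coupling the abstract names: a scalar outcome reward modulating how strongly the generator is pulled toward class-wise visual prototypes, rather than matching semantic prototypes alone.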
Problem

Research questions and friction points this paper is trying to address.

zero-shot learning
generative models
visual features
semantic prototypes
task-agnostic synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement learning
zero-shot learning
generative model
visual cues
outcome-reward