Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL

📅 2025-10-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Few-Shot Class-Incremental Learning (FSCIL) confronts the dual challenges of the stability-plasticity dilemma and the severe scarcity of novel-class samples. To address these, we propose a two-level framework featuring the co-evolution of a diffusion model and a classifier. Our method introduces a novel reward-aligned mutual enhancement mechanism, employing a dynamic multi-level reward function conditioned on the classifier's state to jointly optimize generative diversity and discriminative robustness. Further, we design a unified generative-discriminative training pipeline integrating prototype-anchored MMD regularization, dimension-wise variance matching, confidence recalibration, and cross-session confusion awareness. Evaluated on mainstream FSCIL benchmarks, our approach achieves state-of-the-art performance, significantly improving both old-class knowledge retention and new-class recognition accuracy. Extensive experiments validate that the co-evolutionary paradigm effectively mitigates catastrophic forgetting while enhancing generalization capability in data-scarce incremental settings.
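
The feature-level components named in the summary (prototype-anchored MMD regularization and dimension-wise variance matching) can be sketched compactly. The snippet below is a minimal illustration, not the authors' released code: the function names, the RBF bandwidth `sigma`, and the weight `lambda_var` are assumptions.

```python
# Illustrative sketch of a feature-level reward: prototype-anchored MMD plus
# dimension-wise variance matching. All names and constants are assumptions.
import torch


def rbf_kernel(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Pairwise RBF kernel: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).
    d2 = torch.cdist(x, y).pow(2)
    return torch.exp(-d2 / (2.0 * sigma ** 2))


def feature_level_reward(gen_feats, real_feats, prototype, lambda_var=0.5):
    """Higher reward = generated features better match the class statistics.

    gen_feats:  (B, D) features of diffusion-generated samples
    real_feats: (K, D) features of the few real support samples
    prototype:  (D,)   class prototype, e.g. the mean of real_feats
    """
    # Anchor both sets at the prototype so the discrepancy is measured
    # relative to the class center.
    g = gen_feats - prototype
    r = real_feats - prototype

    # Biased squared-MMD estimate with an RBF kernel.
    mmd2 = (rbf_kernel(g, g).mean() + rbf_kernel(r, r).mean()
            - 2.0 * rbf_kernel(g, r).mean())

    # Dimension-wise variance matching: penalize per-dimension spread that
    # deviates from the real support set (diversity, but realistic diversity).
    var_gap = (gen_feats.var(dim=0, unbiased=False)
               - real_feats.var(dim=0, unbiased=False)).abs().mean()

    return -(mmd2 + lambda_var * var_gap)  # smaller gap -> larger reward
```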

📝 Abstract
Few-Shot Class-Incremental Learning (FSCIL) challenges models to sequentially learn new classes from minimal examples without forgetting prior knowledge, a task complicated by the stability-plasticity dilemma and data scarcity. Current FSCIL methods often struggle with generalization due to their reliance on limited datasets. While diffusion models offer a path for data augmentation, their direct application can lead to semantic misalignment or ineffective guidance. This paper introduces Diffusion-Classifier Synergy (DCS), a novel framework that establishes a mutual boosting loop between a diffusion model and the FSCIL classifier. DCS utilizes a reward-aligned learning strategy, where a dynamic, multi-faceted reward function derived from the classifier's state directs the diffusion model. This reward system operates at two levels: the feature level ensures semantic coherence and diversity using prototype-anchored maximum mean discrepancy and dimension-wise variance matching, while the logits level promotes exploratory image generation and enhances inter-class discriminability through confidence recalibration and cross-session confusion-aware mechanisms. This co-evolutionary process, where generated images refine the classifier and an improved classifier state yields better reward signals, demonstrably achieves state-of-the-art performance on FSCIL benchmarks, significantly enhancing both knowledge retention and new class learning.
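
The abstract's logits-level mechanisms can be read as a reward on classifier outputs. Below is a hedged sketch under assumed details: the temperature, the confidence band (`conf_lo`, `conf_hi`), and the penalty weight `beta` are illustrative, and the paper's exact recalibration and confusion terms may differ.

```python
# Illustrative sketch of a logits-level reward: temperature-recalibrated
# confidence plus a cross-session confusion penalty. Constants are assumptions.
import torch
import torch.nn.functional as F


def logits_level_reward(logits, target_class, old_class_ids,
                        temperature=2.0, conf_lo=0.5, conf_hi=0.9, beta=1.0):
    """logits: (B, C) classifier logits on generated images."""
    probs = F.softmax(logits / temperature, dim=-1)  # recalibrated confidences
    p_target = probs[:, target_class]

    # Confidence recalibration: reward a moderate-confidence band. Trivially
    # easy samples add little diversity; very low-confidence samples are
    # likely semantically misaligned with the target class.
    in_band = ((p_target > conf_lo) & (p_target < conf_hi)).float()
    conf_reward = in_band * p_target

    # Cross-session confusion awareness: penalize probability mass leaking
    # onto classes from earlier sessions (a proxy for forgetting pressure).
    confusion = probs[:, old_class_ids].sum(dim=-1)

    return (conf_reward - beta * confusion).mean()
```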
Problem

Research questions and friction points this paper is trying to address.

Addressing catastrophic forgetting in few-shot class-incremental learning scenarios
Overcoming data scarcity and semantic misalignment in diffusion-based augmentation
Resolving stability-plasticity dilemma through classifier-diffusion model co-evolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mutual boosting loop between the diffusion model and the classifier (see the sketch after this list)
Dynamic multi-faceted reward function guides diffusion generation
Feature and logits level mechanisms ensure semantic alignment
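
Putting the pieces together, the co-evolution could be organized as the loop below, reusing the two reward sketches above. This is a structural sketch only: the method names on `diffusion` and `classifier` (`sample`, `forward_with_features`, `reward_update`, `update`) and the `session` fields are hypothetical placeholders, not the paper's API.

```python
# Structural sketch of the mutual boosting loop; all method and attribute
# names on `diffusion`, `classifier`, and `session` are hypothetical.
def mutual_boosting_loop(diffusion, classifier, sessions, num_steps=100):
    for session in sessions:              # incremental FSCIL sessions
        for _ in range(num_steps):
            # 1) Sample candidate images for this session's novel classes.
            images = diffusion.sample(session.class_labels)

            # 2) Score them with the classifier-conditioned, two-level reward.
            feats, logits = classifier.forward_with_features(images)
            reward = (feature_level_reward(feats, session.real_feats,
                                           session.prototype)
                      + logits_level_reward(logits, session.target_class,
                                            session.old_class_ids))

            # 3) Reward-aligned update of the diffusion model, e.g. a
            #    reward-weighted fine-tuning step.
            diffusion.reward_update(images, session.class_labels, reward)

            # 4) Refine the classifier on real plus high-reward generated
            #    data; the improved classifier sharpens the next reward.
            classifier.update(session.real_data, images, reward)
```

Step 4 closes the loop that the abstract calls co-evolution: better generated data improves the classifier, and the improved classifier state yields better reward signals for the next round of generation.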
👥 Authors
Ruitao Wu
State Key Laboratory of Virtual Reality Technology and Systems, SCSE & QRI, Beihang University
Yifan Zhao
State Key Laboratory of Virtual Reality Technology and Systems, SCSE & QRI, Beihang University
Guangyao Chen
Cornell University
Open-world Learning · Autonomous Agent · AI for Science
Jia Li
State Key Laboratory of Virtual Reality Technology and Systems, SCSE & QRI, Beihang University