CPC-VAR: Continual Personalized and Compositional Generation in Visual Autoregressive Models

📅 2026-05-19
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses two problems that affect existing visual autoregressive models in continual personalized generation: catastrophic forgetting during sequential concept learning, and feature entanglement with inconsistent attributes during multi-concept synthesis. It presents the first systematic study of this setting and introduces a unified framework that requires no model expansion. The approach uses Gradient-based Concept Neuron Selection (GCNS) to make continual learning of individual concepts resistant to forgetting, and combines context-aware multi-branch feature modeling with a spatially conditioned local cross-attention fusion mechanism to support disentangled and controllable multi-concept composition. Experiments show that the proposed method significantly outperforms current baselines in both long-sequence continual learning and multi-concept image synthesis, yielding more accurate generations with higher attribute consistency.
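The neuron-selection idea behind GCNS can be illustrated with a toy sketch. This is not the paper's implementation: the top-k rule on absolute gradient magnitude, the damping factor, and the names `select_concept_neurons` and `constrained_update` are all assumptions made for illustration. The sketch only shows the general pattern of finding concept-relevant neurons per task and restricting updates on those that conflict with earlier tasks.

```python
from typing import List, Set


def select_concept_neurons(grads: List[float], k: int) -> Set[int]:
    """Return indices of the k neurons with the largest absolute gradient.

    Assumption: gradient magnitude is used as the relevance score.
    """
    ranked = sorted(range(len(grads)), key=lambda i: abs(grads[i]), reverse=True)
    return set(ranked[:k])


def constrained_update(
    weights: List[float],
    grads: List[float],
    protected: Set[int],
    lr: float = 0.1,
    damp: float = 0.0,
) -> List[float]:
    """One SGD step in which protected (conflicting) neurons get a damped update.

    damp=0.0 freezes protected neurons; every other neuron is updated normally.
    """
    return [
        w - lr * g * (damp if i in protected else 1.0)
        for i, (w, g) in enumerate(zip(weights, grads))
    ]


# Toy gradients for an earlier concept and a newly arriving concept (hypothetical values).
old_grads = [0.9, 0.1, -0.8, 0.05]
new_grads = [0.7, -0.6, 0.02, 0.01]

old_neurons = select_concept_neurons(old_grads, k=2)
new_neurons = select_concept_neurons(new_grads, k=2)

# Neurons that both concepts rely on are the only ones that get constrained.
conflicts = old_neurons & new_neurons

weights = [1.0, 1.0, 1.0, 1.0]
updated = constrained_update(weights, new_grads, conflicts)
```

In this toy setup neuron 0 matters to both concepts, so it is frozen while the remaining neurons move freely. No parameters are added, which matches the "no model expansion" property described above.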
📝 Abstract
Visual autoregressive (VAR) models have recently emerged as an efficient paradigm for text-to-image generation. Despite their strong generative capability, existing VAR-based personalization methods remain limited to static settings, failing to accommodate evolving user demands. In particular, sequential concept learning leads to severe catastrophic forgetting, while multi-concept synthesis often suffers from feature entanglement and attribute inconsistency. In this work, we present the first systematic study of continual personalized generation in VAR models. We identify two key challenges: (i) preserving previously learned concepts during sequential customization, and (ii) composing multiple personalized concepts in a controllable manner. To address these issues, we propose a unified framework with two core components. For continual single-concept learning, we introduce Gradient-based Concept Neuron Selection (GCNS), which identifies concept-relevant neurons and constrains only conflicting parameters across tasks, effectively mitigating forgetting without additional model expansion. For multi-concept synthesis, we propose a context-aware composition strategy that performs multi-branch feature modeling and localized cross-attention fusion guided by spatial conditions, enabling precise and disentangled concept composition. Extensive experiments demonstrate that our method significantly improves performance in long-sequence continual personalization while achieving superior results in multi-concept image synthesis compared to existing baselines. These findings highlight the potential of VAR models for scalable and controllable personalized generation.
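As a rough illustration of cross-attention that is localized by spatial conditions, the following sketch restricts each image token to the concept tokens whose region covers it. It is a simplified stand-in under stated assumptions, not the paper's fusion mechanism: the boolean `allowed` mask, the function name `masked_cross_attention`, and the toy vectors are all hypothetical.

```python
import math
from typing import List


def masked_cross_attention(
    queries: List[List[float]],
    keys: List[List[float]],
    values: List[List[float]],
    allowed: List[List[bool]],
) -> List[List[float]]:
    """Scaled dot-product attention where allowed[i][j] gates query i -> key j.

    Queries with no visible key receive a zero vector.
    """
    dim_v = len(values[0])
    outputs = []
    for q, row in zip(queries, allowed):
        scale = math.sqrt(len(q))
        scores = [
            sum(a * b for a, b in zip(q, k)) / scale if ok else None
            for k, ok in zip(keys, row)
        ]
        visible = [s for s in scores if s is not None]
        if not visible:
            outputs.append([0.0] * dim_v)
            continue
        peak = max(visible)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) if s is not None else 0.0 for s in scores]
        total = sum(weights)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim_v)]
        )
    return outputs


# Two image tokens: token 0 lies in concept A's region, token 1 in concept B's region.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]      # one key per concept (A, B)
values = [[1.0, 0.0], [0.0, 1.0]]    # one value per concept (A, B)
allowed = [[True, False], [False, True]]

fused = masked_cross_attention(queries, keys, values, allowed)
```

Because each token can only see its own concept, features from concept A never leak into concept B's region, which is the kind of disentanglement the abstract attributes to spatially guided fusion.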
Problem

Research questions and friction points this paper is trying to address.

continual learning
personalized generation
catastrophic forgetting
concept composition
visual autoregressive models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Continual Learning
Personalized Generation
Visual Autoregressive Models
Concept Composition
Catastrophic Forgetting