HyLoVQA: Dynamic Hypernetwork-Generated Low-Rank Adaptation for Continual Visual Question Answering

📅 2026-05-21
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses continual visual question answering, where updating a shared set of parameters often causes interference across tasks and objects, making it hard to adapt to new tasks while retaining old knowledge under non-stationary data streams. To mitigate this, the authors maintain a drift-resilient memory bank of anchors that encode visual objects and textual tasks. A hypernetwork, conditioned on the retrieved anchors, dynamically generates lightweight Low-Rank Adaptation (LoRA) adapters for the current task and object. An alignment loss between the semantic feature space and the parameter space further constrains the adapters, reducing interference and improving knowledge stability. The authors report that the method outperforms prior state-of-the-art approaches on the VQA v2 and NExT-QA benchmarks under both standard and compositional continual learning settings.
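The retrieve-then-generate step described above can be sketched in a few lines of plain Python. This is only an illustrative toy, not the authors' implementation: the cosine-similarity retrieval, the single linear layer used as the hypernetwork, and the names `retrieve`, `ToyHyperNet`, and `adapted_forward` are all assumptions introduced here.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(bank, query):
    """Index of the anchor most similar to the query (assumed retrieval rule)."""
    return max(range(len(bank)), key=lambda i: cosine(bank[i], query))

def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in m]

class ToyHyperNet:
    """Hypothetical linear hypernetwork mapping an anchor to LoRA factors A and B."""

    def __init__(self, anchor_dim, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        n_out = rank * d_in + d_out * rank  # total entries in A plus B
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(anchor_dim)]
                        for _ in range(n_out)]

    def generate(self, anchor):
        """Return A (rank x d_in) and B (d_out x rank) for the given anchor."""
        flat = matvec(self.weights, anchor)
        split = self.rank * self.d_in
        a = [flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        b = [flat[split + i * self.rank:split + (i + 1) * self.rank]
             for i in range(self.d_out)]
        return a, b

def adapted_forward(w_frozen, a, b, x):
    """Compute W x + B (A x): frozen base output plus the low-rank update."""
    base = matvec(w_frozen, x)
    delta = matvec(b, matvec(a, x))
    return [p + q for p, q in zip(base, delta)]

# Toy usage: two anchors, one query feature, one frozen 2x4 layer.
bank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
idx = retrieve(bank, [0.1, 0.9, 0.0])
hyper = ToyHyperNet(anchor_dim=3, d_in=4, d_out=2, rank=1)
lora_a, lora_b = hyper.generate(bank[idx])
w = [[0.5] * 4, [-0.5] * 4]
y = adapted_forward(w, lora_a, lora_b, [1.0, 2.0, 3.0, 4.0])
```

In this sketch the base weights stay frozen and only the generated factors change with the retrieved anchor, so different tasks or objects receive different low-rank updates without overwriting shared parameters.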
📝 Abstract
Continual Visual Question Answering (VQA) requires learning from non-stationary streams of visual inputs and questions while preserving past knowledge. Most prior methods adapt by updating a largely shared parameter set. This often leads to cross-level task interference, hindering accurate adaptation to the current task and object. To address this limitation, we propose HyLoVQA. It maintains a drift-resilient memory bank of anchors, which store the content of visual objects and textual tasks and are updated using current input features. Conditioned on retrieved anchors, a hypernetwork generates lightweight Low-Rank Adaptation (LoRA) adapters. This keeps the approach parameter-efficient while allowing the model to adapt to each task and object dynamically. Additionally, we formulate an alignment loss that aligns semantic discrepancies in the feature space with functional changes in the parameter space, thereby constraining LoRA adapters to remain focused on the current task and object. Extensive experiments on VQA v2 and NExT-QA under both standard and compositional settings demonstrate the superiority of HyLoVQA over prior state-of-the-art methods.
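The abstract does not give the anchor update rule or the exact form of the alignment loss, so the following is only one plausible reading, stated under explicit assumptions: anchors are updated by an exponential moving average of current features (`ema_update`, with a hypothetical `momentum` parameter), and the loss penalizes the squared gap between pairwise cosine distances of anchors and of flattened adapter parameters (`alignment_loss`). Both function names and both formulas are illustrative choices, not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def ema_update(anchor, feature, momentum=0.9):
    """Assumed anchor update: move the anchor slowly toward the new feature."""
    return [momentum * a + (1.0 - momentum) * f for a, f in zip(anchor, feature)]

def alignment_loss(anchors, adapter_params):
    """Assumed loss: mean squared gap between semantic and parameter distances.

    anchors[i] is the semantic anchor for item i; adapter_params[i] is the
    flattened adapter generated for that anchor.
    """
    total, pairs = 0.0, 0
    for i in range(len(anchors)):
        for j in range(i + 1, len(anchors)):
            d_sem = 1.0 - cosine(anchors[i], anchors[j])
            d_par = 1.0 - cosine(adapter_params[i], adapter_params[j])
            total += (d_sem - d_par) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0

# Toy usage: dissimilar anchors paired with identical adapters are penalized.
anchors = [[1.0, 0.0], [0.0, 1.0]]
collapsed = [[1.0, 0.0], [1.0, 0.0]]
loss_collapsed = alignment_loss(anchors, collapsed)
loss_aligned = alignment_loss(anchors, anchors)
```

Under this reading, the loss is zero when adapters differ from one another exactly as much as their anchors do, and it grows when semantically distant tasks or objects are given near-identical adapters (or the reverse).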
Problem

Research questions and friction points this paper is trying to address.

Continual Learning
Visual Question Answering
Task Interference
Knowledge Preservation
Non-stationary Streams
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypernetwork
Low-Rank Adaptation
Continual Learning
Visual Question Answering
Memory Bank