HyLoVQA: Dynamic Hypernetwork-Generated Low-Rank Adaptation for Continual Visual Question Answering

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses continual visual question answering, where updates to shared parameters often cause interference across tasks and objects, making it hard to adapt to new tasks while retaining old knowledge under non-stationary data streams. To mitigate this, the authors propose a paradigm built around a drift-resistant memory bank of anchors that encode both visual objects and textual task representations. Conditioned on retrieved anchors, a hypernetwork dynamically generates lightweight Low-Rank Adaptation (LoRA) adapters, enabling efficient adaptation to the current task and object. In addition, a semantic–parameter space alignment loss is introduced to reduce interference and improve knowledge stability. The authors report that the method outperforms prior state-of-the-art approaches on the VQA v2 and NExT-QA benchmarks under both standard and compositional continual learning settings.

📝 Abstract

Continual Visual Question Answering (VQA) requires learning from non-stationary streams of visual inputs and questions while preserving past knowledge. Most prior methods adapt by updating a largely shared parameter set. This often leads to cross-level task interference, hindering accurate adaptation to the current task and object. To address this limitation, we propose HyLoVQA. It maintains a drift-resilient memory bank of anchors that store the content of visual objects and textual tasks; the anchors are updated using current input features. Conditioned on retrieved anchors, a hypernetwork generates lightweight Low-Rank Adaptation (LoRA) adapters. This keeps the approach parameter-efficient while allowing the model to adapt to each task and object dynamically. Additionally, we formulate an alignment loss that aligns semantic discrepancies in the feature space with functional changes in the parameter space, thereby constraining LoRA adapters to remain focused on the current task and object. Extensive experiments on VQA v2 and NExT-QA under both standard and compositional settings demonstrate the superiority of HyLoVQA over prior state-of-the-art methods.
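
To make the anchor-conditioned adaptation described in the abstract more concrete, here is a minimal NumPy sketch of the general idea: retrieve the nearest anchor for an input feature, let a hypernetwork map that anchor to LoRA factors, and apply the resulting low-rank update on top of a frozen weight. This is not the authors' implementation. The class names (`AnchorBank`, `LoRAHyperNet`), the cosine-similarity retrieval, the moving-average anchor update, the single linear hypernetwork, the single shared bank (rather than separate visual and textual anchors), and all dimensions are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the example small.
D_FEAT, D_MODEL, RANK, N_ANCHORS = 8, 6, 2, 4


class AnchorBank:
    """Hypothetical memory bank: cosine retrieval plus a slow moving-average update."""

    def __init__(self, n_anchors, dim, momentum=0.99):
        self.anchors = rng.standard_normal((n_anchors, dim))
        self.momentum = momentum  # assumption: high momentum limits anchor drift

    def retrieve(self, feat):
        norms = np.linalg.norm(self.anchors, axis=1) * np.linalg.norm(feat)
        sims = self.anchors @ feat / (norms + 1e-8)
        idx = int(np.argmax(sims))
        return idx, self.anchors[idx].copy()

    def update(self, idx, feat):
        m = self.momentum
        self.anchors[idx] = m * self.anchors[idx] + (1.0 - m) * feat


class LoRAHyperNet:
    """Hypothetical linear hypernetwork mapping an anchor to LoRA factors A and B."""

    def __init__(self, d_feat, d_in, d_out, rank):
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        self.W_a = rng.standard_normal((rank * d_in, d_feat)) * 0.01
        # Zero init for B so the adapter starts as a no-op, as in standard LoRA.
        self.W_b = np.zeros((d_out * rank, d_feat))

    def generate(self, anchor):
        A = (self.W_a @ anchor).reshape(self.rank, self.d_in)
        B = (self.W_b @ anchor).reshape(self.d_out, self.rank)
        return A, B


def adapted_linear(x, W_frozen, A, B, scale=1.0):
    """Frozen linear layer plus the generated low-rank update B @ A."""
    return W_frozen @ x + scale * (B @ (A @ x))


bank = AnchorBank(N_ANCHORS, D_FEAT)
hyper = LoRAHyperNet(D_FEAT, D_MODEL, D_MODEL, RANK)
W_frozen = rng.standard_normal((D_MODEL, D_MODEL))

feat = rng.standard_normal(D_FEAT)   # stand-in for a fused visual/text feature
x = rng.standard_normal(D_MODEL)     # stand-in for a hidden activation

idx, anchor = bank.retrieve(feat)
A, B = hyper.generate(anchor)
y = adapted_linear(x, W_frozen, A, B)
bank.update(idx, feat)               # nudge the retrieved anchor toward the input
```

Because the adapter is generated per input from the retrieved anchor, only the hypernetwork and the bank carry task-specific state in this sketch; the backbone weight `W_frozen` is never modified. The alignment loss from the abstract is not modelled here.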

Problem

Research questions and friction points this paper is trying to address.

Continual Learning

Visual Question Answering

Task Interference

Knowledge Preservation

Non-stationary Streams

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypernetwork

Low-Rank Adaptation

Continual Learning