🤖 AI Summary
This work addresses two fundamental challenges in general continual learning under single-pass, non-stationary data streams without explicit task boundaries: allocating expert parameters to evolving data distributions and improving their representational capacity under limited supervision. Inspired by the hierarchical memory system of the fruit fly, the problem is decoupled into two subproblems: expert routing and expert competence improvement. The authors propose FlyPrompt, which pairs a randomly expanded analytic router for instance-level expert activation with a temporal ensemble of output heads that dynamically adapts decision boundaries over time. To the authors' knowledge, this is the first brain-inspired randomly expanded routing mechanism combined with a temporally ensembled expert architecture, achieving significant performance gains while maintaining parameter efficiency. The method outperforms state-of-the-art baselines by up to 11.23%, 12.43%, and 7.62% on CIFAR-100, ImageNet-R, and CUB-200, respectively.
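To give intuition for the fly-inspired "sparse expansion" routing idea, here is a minimal NumPy sketch, not the paper's implementation: inputs are projected through a fixed sparse random matrix into a much higher-dimensional space (in the style of the fly olfactory circuit / FlyHash), a winner-take-all step keeps only the top-k units, and an expert is selected per instance by matching the sparse code against expert prototypes. All dimensions, the 10% projection density, and the prototype-matching rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

D, K, TOP = 64, 512, 16          # input dim, expanded dim, active units kept

# Fixed sparse random expansion matrix (fly-olfaction-style projection).
# Each expanded unit connects to ~10% of input dimensions (assumed density).
W = (rng.random((K, D)) < 0.1).astype(float)

def sparse_expand(x):
    """Project x into the expanded space and keep the top-k units (winner-take-all)."""
    h = W @ x
    code = np.zeros(K)
    idx = np.argsort(h)[-TOP:]
    code[idx] = h[idx]
    return code

def route(x, prototypes):
    """Pick, per instance, the expert whose prototype best matches the sparse code."""
    c = sparse_expand(x)
    scores = prototypes @ c      # prototypes: (n_experts, K)
    return int(np.argmax(scores))
```

The sparse high-dimensional code makes instances from different distributions nearly orthogonal, which is what lets a simple linear matching rule route them to different experts.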
📝 Abstract
General continual learning (GCL) challenges intelligent systems to learn from single-pass, non-stationary data streams without clear task boundaries. While recent advances in continual parameter-efficient tuning (PET) of pretrained models show promise, they typically rely on multiple training epochs and explicit task cues, limiting their effectiveness in GCL scenarios. Moreover, existing methods often lack targeted design and fail to address two fundamental challenges in continual PET: how to allocate expert parameters to evolving data distributions, and how to improve their representational capacity under limited supervision. Inspired by the fruit fly's hierarchical memory system characterized by sparse expansion and modular ensembles, we propose FlyPrompt, a brain-inspired framework that decomposes GCL into two subproblems: expert routing and expert competence improvement. FlyPrompt introduces a randomly expanded analytic router for instance-level expert activation and a temporal ensemble of output heads to dynamically adapt decision boundaries over time. Extensive theoretical and empirical evaluations demonstrate FlyPrompt's superior performance, achieving up to 11.23%, 12.43%, and 7.62% gains over state-of-the-art baselines on CIFAR-100, ImageNet-R, and CUB-200, respectively. Our source code is available at https://github.com/AnAppleCore/FlyGCL.
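The "temporal ensemble of output heads" can be illustrated with a small sketch, again an assumption-laden toy rather than FlyPrompt's actual head: an online linear classifier is updated on each streaming sample, while a slow copy of its weights is maintained as an exponential moving average, so the decision boundary used for prediction adapts smoothly over time. The momentum and learning-rate values are illustrative.

```python
import numpy as np

class TemporalEnsembleHead:
    """Linear head whose prediction weights are a temporal EMA of the online weights."""

    def __init__(self, dim, n_classes, momentum=0.99, lr=0.1):
        self.W = np.zeros((n_classes, dim))       # fast weights, updated per sample
        self.W_ema = np.zeros((n_classes, dim))   # slow, temporally ensembled weights
        self.m, self.lr = momentum, lr

    def update(self, x, y):
        # One online softmax-regression step on the fast head.
        logits = self.W @ x
        p = np.exp(logits - logits.max())
        p /= p.sum()
        p[y] -= 1.0                               # gradient of cross-entropy w.r.t. logits
        self.W -= self.lr * np.outer(p, x)
        # Temporal ensembling: slow weights track the fast weights via EMA.
        self.W_ema = self.m * self.W_ema + (1 - self.m) * self.W

    def predict(self, x):
        return int(np.argmax(self.W_ema @ x))
```

Predicting with the EMA weights averages the head over many recent updates, which damps the noise of single-pass streaming updates while still letting the boundary drift with the data distribution.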