AI Summary
This work addresses representation collapse and poor online and continual learning performance in non-contrastive self-supervised learning. Inspired by the temporal prediction hypothesis of the hippocampus, the authors propose PhiNet, a framework that brings a hippocampo-neocortical complementary learning system into non-contrastive learning. PhiNet uses a momentum encoder to emulate slow neocortical learning and adds a second predictor, modeled on the CA1 region, that estimates the representation of the original (unaugmented) image to stabilize the learned representations. An analysis of the learning dynamics shows that the additional predictor helps prevent complete collapse, which may partially corroborate the biological plausibility of the underlying hippocampal model. Experiments show that PhiNet is more robust to weight decay than SimSiam and outperforms it in both online and continual learning settings.
Abstract
SimSiam is a prominent self-supervised learning method that achieves impressive results in various vision tasks under static environments. However, it has two critical issues: high sensitivity to hyperparameters, especially weight decay, and unsatisfactory performance in online and continual learning, settings in which neuroscientists believe powerful memory functions are necessary, as they are in brains. In this paper, we propose PhiNet, inspired by a hippocampal model based on the temporal prediction hypothesis. Unlike SimSiam, which aligns two augmented views of the original image, PhiNet integrates an additional predictor block that estimates the original image representation to imitate the CA1 region in the hippocampus. Moreover, inspired by the Complementary Learning Systems theory, we model the neocortex with a momentum encoder block that acts as a slow learner and works as long-term memory. By analysing the learning dynamics, we demonstrate that the additional predictor helps PhiNet prevent the complete collapse of learned representations, a notorious challenge in non-contrastive learning. This dynamics analysis may partially corroborate why this hippocampal model is biologically plausible. Experimental results demonstrate that PhiNet is more robust to weight decay and performs better than SimSiam in memory-intensive tasks like online and continual learning.
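The two building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `neg_cosine` and `ema_update`, the exponential-moving-average form of the momentum update, and the momentum value 0.99 are all assumptions made for illustration. The negative cosine similarity is the alignment loss commonly used in SimSiam-style training; how PhiNet combines its loss terms is not stated in the abstract and is not shown here.

```python
import math


def neg_cosine(p, z):
    """Negative cosine similarity between a prediction p and a target z.

    SimSiam-style methods minimize this to align two views; in a real
    framework the target z would be treated as a constant (stop-gradient).
    """
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def ema_update(slow, fast, momentum=0.99):
    """Assumed momentum-encoder update: the slow weights move a small step
    toward the fast encoder weights, so they change gradually over time."""
    return [momentum * s + (1.0 - momentum) * f for s, f in zip(slow, fast)]


# Hypothetical flattened weights, used only to show the call pattern.
fast_weights = [1.0, 2.0, 3.0]
slow_weights = [0.0, 0.0, 0.0]
slow_weights = ema_update(slow_weights, fast_weights, momentum=0.9)
```

With a momentum close to 1, the slow weights track the fast encoder only gradually, which is the sense in which the momentum encoder behaves as a slow learner.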