🤖 AI Summary
Recurrent neural networks (RNNs) trained on the same task often exhibit behavioral equivalence yet divergent internal representations—a phenomenon termed “solution degeneracy”—which impedes model interpretability and inference of underlying neural mechanisms.
Method: We introduce the first unified, multi-level framework for quantifying and controlling degeneracy across three domains: behavior, neural dynamics, and weight space. The framework is grounded in large-scale experiments on 3,400 RNNs trained on four canonical neuroscience tasks, and integrates multi-scale behavioral similarity metrics, manifold alignment, weight-distance analysis, and systematic hyperparameter sweeps.
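The summary does not spell out the exact similarity metrics used at each level. As an illustrative sketch only (not the paper's implementation), linear centered kernel alignment (CKA) is one common way to compare hidden-state geometries across networks, and averaging pairwise dissimilarity over a population of trained networks yields a simple dynamical-degeneracy score; the function names below are hypothetical:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two activation
    matrices of shape (timesteps, units); unit counts may differ.
    Returns a value in [0, 1]: 1 means identical representational
    geometry (up to rotation/scaling), lower means more dissimilar."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)

def dynamical_degeneracy(trajectories):
    """Mean pairwise (1 - CKA) across networks' hidden-state
    trajectories: higher values indicate more degenerate dynamics."""
    n = len(trajectories)
    dissims = [1.0 - linear_cka(trajectories[i], trajectories[j])
               for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dissims))
```

Because linear CKA is invariant to orthogonal transformations of the state space, two networks whose trajectories differ only by a rotation score as dynamically equivalent, which is the desired behavior for a dynamics-level (rather than unit-level) comparison.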
Contribution/Results: We empirically validate the Contravariance Principle: increased task complexity and stronger feature learning reduce dynamical degeneracy but exacerbate weight-space degeneracy, with mixed effects on behavior; conversely, scaling network size or imposing structural regularization systematically suppresses degeneracy at all three levels. These findings provide an empirical foundation and principled design guidelines for building interpretable, biologically plausible RNNs.
📝 Abstract
Task-trained recurrent neural networks (RNNs) are widely used in neuroscience and machine learning to model dynamical computations. To gain mechanistic insight into how neural systems solve tasks, prior work often reverse-engineers individual trained networks. However, different RNNs trained on the same task and achieving similar performance can exhibit strikingly different internal solutions, a phenomenon known as solution degeneracy. Here, we develop a unified framework to systematically quantify and control solution degeneracy across three levels: behavior, neural dynamics, and weight space. We apply this framework to 3,400 RNNs trained on four neuroscience-relevant tasks (flip-flop memory, sine wave generation, delayed discrimination, and path integration) while systematically varying task complexity, learning regime, network size, and regularization. We find that higher task complexity and stronger feature learning reduce degeneracy in neural dynamics but increase it in weight space, with mixed effects on behavior. In contrast, larger networks and structural regularization reduce degeneracy at all three levels. These findings empirically validate the Contravariance Principle and provide practical guidance for researchers aiming to tailor RNN solutions, whether to uncover shared neural mechanisms or to model individual variability observed in biological systems. This work provides a principled framework for quantifying and controlling solution degeneracy in task-trained RNNs, offering new tools for building more interpretable and biologically grounded models of neural computation.
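One subtlety behind the weight-space findings above: permuting an RNN's hidden units changes its weight matrices without changing its behavior, so a meaningful weight distance should be computed up to such symmetries, or identical solutions would appear maximally degenerate. A brute-force sketch of this idea (illustrative only, and assuming tiny networks; a practical implementation would align units with the Hungarian algorithm rather than enumerating permutations):

```python
import itertools
import numpy as np

def permuted_weight_distance(W1, W2):
    """Frobenius distance between two recurrent weight matrices,
    minimized over hidden-unit permutations. Relabeling units leaves
    network behavior unchanged, so a naive np.linalg.norm(W1 - W2)
    would overstate weight-space degeneracy. O(n!) - tiny nets only."""
    n = W1.shape[0]
    best = np.inf
    for perm in itertools.permutations(range(n)):
        P = np.eye(n)[list(perm)]
        # Apply the same relabeling to both rows and columns of W2.
        best = min(best, np.linalg.norm(W1 - P @ W2 @ P.T))
    return best
```

Under this measure, two networks whose weights differ only by a unit relabeling have distance zero, so any residual distance reflects genuinely different weight-space solutions.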