🤖 AI Summary
This work addresses the stability of modular Global Workspace (GW) architectures in multi-region recurrent neural networks (RNNs). We propose a relaxed stability criterion grounded in contraction analysis, which we present as the first provable stability condition for GW-type multi-region RNNs. Methodologically, we combine recursive RNN construction, sparse graph-based topology modeling, and modular training to design a modular RNN architecture with sparse inter-module connections. Our key contributions are threefold: (i) a theoretical link among modularity, sparsity, and robustness; (ii) higher test accuracy with fewer parameters on sequence modeling tasks; and (iii) robustness under random subnetwork removal, surpassing prior stable RNNs in both accuracy and resilience. This advances the state of the art in provably stable, modular RNN design.
📝 Abstract
To push forward the important emerging research field surrounding multi-area recurrent neural networks (RNNs), we expand theoretically and empirically on the provably stable RNNs of RNNs introduced by Kozachkov et al. in "RNNs of RNNs: Recursive Construction of Stable Assemblies of Recurrent Neural Networks". We prove relaxed stability conditions for salient special cases of this architecture, most notably for a global workspace modular structure. We then demonstrate empirical success for Global Workspace Sparse Combo Nets with a small number of trainable parameters, not only through strong overall test performance but also through greater resilience to removal of individual subnetworks. These empirical results for the global workspace inter-area topology are contingent on stability preservation, highlighting the relevance of our theoretical work for enabling modular RNN success. Further, by exploring sparsity in the connectivity structure between different subnetwork modules more broadly, we improve the state-of-the-art performance for stable RNNs on benchmark sequence processing tasks, thus underscoring the general utility of specialized graph structures for multi-area RNNs.
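To make the global workspace inter-area topology concrete, here is a minimal sketch of one way to read it: a hub-and-spoke block mask in which every subnetwork exchanges signals only with a single shared workspace module. The function names, the choice of a symmetric mask, and the row-is-target convention are all assumptions made for illustration. The abstract does not specify them, and the sketch does not implement the paper's stability conditions or training procedure.

```python
import numpy as np


def global_workspace_mask(module_sizes, workspace_index=0):
    """Boolean mask over the full weight matrix (hypothetical helper).

    Entry [row, col] is True where a connection from unit `col` to unit `row`
    is allowed. Only blocks linking a module to the workspace are enabled;
    within-module (diagonal) blocks are left False because they belong to the
    subnetworks themselves, not to the inter-area topology.
    """
    offsets = np.concatenate(([0], np.cumsum(module_sizes)))
    n = int(offsets[-1])
    mask = np.zeros((n, n), dtype=bool)
    w0, w1 = offsets[workspace_index], offsets[workspace_index + 1]
    for i in range(len(module_sizes)):
        if i == workspace_index:
            continue
        a, b = offsets[i], offsets[i + 1]
        mask[a:b, w0:w1] = True  # workspace -> module i
        mask[w0:w1, a:b] = True  # module i -> workspace
    return mask


def inter_module_density(mask, module_sizes):
    """Fraction of possible inter-module entries that the mask allows."""
    n = sum(module_sizes)
    possible = n * n - sum(s * s for s in module_sizes)
    return mask.sum() / possible


sizes = [2, 3, 3]  # assumed toy sizes; the first module acts as the workspace
mask = global_workspace_mask(sizes, workspace_index=0)
print(int(mask.sum()), inter_module_density(mask, sizes))
```

With these toy sizes, 24 of the 42 possible inter-module entries are allowed. In this layout, deleting any non-workspace module removes only that module's two blocks, so the remaining modules keep their links to the workspace.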