Bridging Expressivity and Scalability with Adaptive Unitary SSMs

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
State space models (SSMs) efficiently model long sequences but suffer from time-invariance and real-valued recurrent dynamics, limiting their ability to recognize certain regular languages—e.g., modular counting and parity checking. Method: We propose Adaptive Unitary State Space Models (AUSSMs), introducing the first input-dependent skew-symmetric unitary recurrence mechanism. This design enables exact simulation of solvable-group automata and precise modular arithmetic. AUSSMs integrate separable convolutions with CUDA-optimized kernels, preserving linear-time complexity while enabling efficient parallel training. Contribution/Results: Theoretically, AUSSMs overcome a fundamental expressivity barrier of SSMs in formal language recognition. Empirically, they significantly outperform existing SSMs on algorithmic tasks—including parity detection and modular arithmetic—and achieve state-of-the-art performance on real-world long-sequence classification benchmarks. AUSSMs thus unify symbolic reasoning and continuous sequential modeling within a single, scalable architecture.
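To make the parity claim concrete: in two dimensions, the matrix exponential of a skew-symmetric generator is a plane rotation, and an input-dependent rotation by π per 1-bit tracks parity exactly while keeping the state norm fixed. The sketch below is illustrative only; the function names and the scalar parameterization are assumptions, not the paper's actual AUSSM formulation.

```python
import numpy as np

def rot(theta):
    # 2x2 rotation matrix: the matrix exponential of the skew-symmetric
    # generator theta * [[0, -1], [1, 0]], hence orthogonal (norm-preserving).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def parity(bits):
    # Input-dependent unitary recurrence: rotate the state by pi only when
    # the current input bit is 1. After the sequence, the sign of the first
    # coordinate encodes the parity of the number of ones.
    h = np.array([1.0, 0.0])
    for b in bits:
        h = rot(np.pi * b) @ h
    return int(h[0] < 0)

print(parity([1, 0, 1, 1]))  # odd number of ones  -> 1
print(parity([1, 1, 0, 0]))  # even number of ones -> 0
```

A time-invariant real-valued recurrence cannot do this at fixed precision, because its state either decays or grows; the rotation keeps the two parity classes exactly separated forever.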

📝 Abstract
Recent work has revealed that state space models (SSMs), while efficient for long-sequence processing, are fundamentally limited in their ability to represent formal languages, particularly due to time-invariant and real-valued recurrence structures. In this work, we draw inspiration from adaptive and structured dynamics observed in biological neural systems and introduce the Adaptive Unitary State Space Model (AUSSM), a novel class of SSMs that leverages skew-symmetric, input-dependent recurrence to achieve unitary evolution and high expressive power. Using algebraic automata theory, we prove that AUSSM can perform modulo counting and simulate solvable group automata at finite precision, enabling SSMs to model a broad class of regular languages that are out of reach for other SSM architectures. To overcome the practical inefficiencies of adaptive recurrence, we develop a separable convolution formulation and a CUDA implementation that enables scalable parallel training. Empirically, we show that AUSSM, when interleaved with Mamba, outperforms prior SSMs on formal algorithmic tasks such as parity and modular arithmetic, and achieves competitive performance on real-world long time-series classification benchmarks. Our results demonstrate that adaptive unitary recurrence provides a powerful and efficient inductive bias for both symbolic and continuous sequence modeling.
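The abstract's modulo-counting claim generalizes the parity idea: rotating a unit state by 2π/k per "increment" symbol counts modulo k at finite precision, since the norm never drifts. The following is a minimal sketch under that interpretation; the helper name and decoding step are hypothetical, not the paper's construction.

```python
import numpy as np

def count_mod_k(symbols, k):
    # Count "increment" symbols modulo k with a norm-preserving state:
    # each 1 rotates the state by 2*pi/k, so the accumulated angle
    # identifies the count mod k without the state growing or decaying.
    h = np.array([1.0, 0.0])
    step = 2 * np.pi / k
    c, s = np.cos(step), np.sin(step)
    R = np.array([[c, -s], [s, c]])  # exp of the skew-symmetric generator
    for sym in symbols:
        if sym == 1:
            h = R @ h
    angle = np.arctan2(h[1], h[0]) % (2 * np.pi)
    return int(round(angle / step)) % k

print(count_mod_k([1, 1, 0, 1, 1, 1], 3))  # five ones -> 5 mod 3 = 2
```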
Problem

Research questions and friction points this paper is trying to address.

Enhancing SSMs to represent complex formal languages
Overcoming time-invariant limitations in state space models
Improving scalability and efficiency in adaptive recurrence SSMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Unitary State Space Model (AUSSM)
Skew-symmetric input-dependent recurrence
Scalable CUDA implementation for parallel training
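One way to see why adaptive unitary recurrence admits parallel training: when the step rotations commute (the planar case), the product of all per-step unitaries collapses to a single rotation by the cumulative angle, i.e. a prefix sum over the inputs that a parallel scan or convolution can evaluate. This toy equivalence check is an assumption-laden sketch of that idea, not the paper's separable-convolution kernel.

```python
import numpy as np

def sequential(thetas, h0):
    # Step-by-step adaptive unitary recurrence: one rotation per time step.
    h = h0.copy()
    for t in thetas:
        c, s = np.cos(t), np.sin(t)
        h = np.array([[c, -s], [s, c]]) @ h
    return h

def closed_form(thetas, h0):
    # Planar rotations commute, so the product of all step rotations equals
    # one rotation by the cumulative angle -- a prefix sum over the inputs,
    # computable by a parallel scan instead of a sequential loop.
    T = np.sum(thetas)
    c, s = np.cos(T), np.sin(T)
    return np.array([[c, -s], [s, c]]) @ h0

thetas = np.random.default_rng(0).normal(size=16)
h0 = np.array([1.0, 0.0])
print(np.allclose(sequential(thetas, h0), closed_form(thetas, h0)))  # True
```

General skew-symmetric generators need not commute, which is where a more careful formulation (such as the paper's separable convolution) comes in.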
Arjun Karuvally
Salk Institute for Biological Studies
Franz Nowak
ETH Zurich
Anderson T. Keller
The Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University
Carmen Amo Alonso
Computer Science Department, Stanford University
Terrence J. Sejnowski
Salk Institute for Biological Studies
Hava T. Siegelmann
University of Massachusetts Amherst