🤖 AI Summary
To address the energy and speed bottlenecks that interconnect structures impose on large-scale neuromorphic computing, this paper proposes the "Processing-in-Interconnect" (π²) paradigm. It is the first work to exploit hardware primitives natively embedded in routing and switching fabrics, such as latency, causality, broadcasting, buffering, traffic shaping, and packet loss, to implement neuronal dynamics and synaptic plasticity *in situ*, thereby turning the interconnect infrastructure itself into a computational substrate. The paper further introduces a performance-preserving, knowledge-distillation-based cross-mapping methodology for deploying conventional neural networks onto π² architectures. Theoretical analysis demonstrates that π²'s energy efficiency improves with increasing interconnect bandwidth, enabling scalable, brain-scale inference at only hundreds of watts, significantly outperforming state-of-the-art neuromorphic platforms.
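To make the traffic-shaping/neuron connection concrete, below is a minimal illustrative sketch, not the paper's implementation: a standard leaky-bucket shaper reinterpreted as a leaky integrate-and-fire neuron, where the queue fill level plays the role of the membrane potential, the constant drain rate the leak, and the bucket depth the firing threshold. All names and constants are assumptions for illustration.

```python
# Illustrative sketch only: a leaky-bucket traffic shaper behaving as a
# leaky integrate-and-fire neuron. Fill level <-> membrane potential,
# drain rate <-> membrane leak, bucket depth <-> firing threshold.
# Names and constants are assumed, not taken from the paper.

class LeakyBucketNeuron:
    def __init__(self, leak_rate=1.0, threshold=10.0):
        self.level = 0.0            # bucket fill level / membrane potential
        self.leak_rate = leak_rate  # drain rate / leak (units per second)
        self.threshold = threshold  # bucket depth / firing threshold
        self.last_t = 0.0           # time of the last packet arrival

    def on_packet(self, t: float, weight: float) -> bool:
        """Process an input packet arriving at time t with synaptic weight."""
        # Drain (leak) for the time elapsed since the last arrival.
        self.level = max(0.0, self.level - self.leak_rate * (t - self.last_t))
        self.last_t = t
        # Buffering: accumulate the packet's synaptic contribution.
        self.level += weight
        # Overflow: emit an output packet (a "spike") and reset.
        if self.level >= self.threshold:
            self.level = 0.0
            return True
        return False
```

In a π² system, the same state machine would run inside a switch's existing shaper and queue logic, with the output packet broadcast to downstream nodes; plain Python stands in for that hardware here.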
📝 Abstract
Routing, switching, and the interconnect fabric are essential for large-scale neuromorphic computing. Although this fabric plays only a supporting role in computation, for large AI workloads it ultimately determines energy consumption and speed. In this paper, we address this bottleneck by asking: (a) What computing paradigms are inherent in existing routing, switching, and interconnect systems, and how can they be used to implement a Processing-in-Interconnect (π²) computing paradigm? and (b) Leveraging current and future interconnect trends, how will a π² system's performance scale compared to other neuromorphic architectures? For (a), we show that the operations required by typical AI workloads can be mapped onto delays, causality, time-outs, packet drops, and broadcast operations, primitives already implemented in packet-switching and packet-routing hardware. We show that existing embedded buffering and traffic-shaping algorithms can be leveraged to implement neuron models and synaptic operations. Additionally, a knowledge-distillation framework can train and cross-map well-established neural network topologies onto π² without degrading generalization performance. For (b), analytical modeling shows that, unlike on other neuromorphic platforms, the energy scaling of π² improves with interconnect bandwidth and energy efficiency. We predict that, by leveraging trends in interconnect technology, a π² architecture can scale to brain-scale AI inference workloads more readily than existing platforms, at power consumption levels in the range of hundreds of watts.
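As a rough sketch of the cross-mapping step, the distillation objective could take the standard soft-target form below (Hinton et al.), with the student standing in for the π²-mapped network and the teacher for the conventional source network. This is an assumed, generic KD loss, not the paper's actual training code.

```python
# Generic knowledge-distillation loss (soft targets + hard labels), shown as
# an assumed sketch of the cross-mapping objective; the paper's actual loss
# and pi^2 student model are not specified here.
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Alpha-weighted mix of distillation (KL at temperature T) and CE loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),  # student log-probs
        F.softmax(teacher_logits / T, dim=-1),      # softened teacher targets
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term's magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

Freezing the teacher and minimizing this loss over the student's π²-compatible parameters is one plausible reading of cross-mapping "without degrading generalization performance."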