🤖 AI Summary
To address the energy and speed bottlenecks that interconnect structures impose on large-scale neuromorphic computing, this paper proposes the "Processing-in-Interconnect" (π²) paradigm. It is the first work to exploit hardware primitives natively embedded in routing and switching fabrics, such as latency, causality, broadcasting, buffering, traffic shaping, and packet loss, to implement neuronal dynamics and synaptic plasticity *in situ*, thereby turning the interconnect infrastructure itself into a computational substrate. The paper further introduces a performance-preserving, knowledge-distillation-based cross-mapping methodology for deploying conventional neural networks onto π² architectures. Theoretical analysis demonstrates that π²'s energy efficiency improves with increasing interconnect bandwidth, enabling scalable, brain-scale inference at only hundreds of watts, significantly outperforming state-of-the-art neuromorphic platforms.
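To make the traffic-shaping/neuron connection concrete, below is a minimal illustrative sketch, not the paper's implementation: a standard leaky-bucket shaper reinterpreted as a leaky integrate-and-fire neuron, where the queue fill level plays the role of the membrane potential, the constant drain rate the leak, and the bucket depth the firing threshold. All names and constants are assumptions for illustration.

```python
# Illustrative sketch only: a leaky-bucket traffic shaper behaving as a
# leaky integrate-and-fire neuron. Fill level <-> membrane potential,
# drain rate <-> membrane leak, bucket depth <-> firing threshold.
# Names and constants are assumed, not taken from the paper.

class LeakyBucketNeuron:
    def __init__(self, leak_rate=1.0, threshold=10.0):
        self.level = 0.0            # bucket fill level / membrane potential
        self.leak_rate = leak_rate  # drain rate / leak (units per second)
        self.threshold = threshold  # bucket depth / firing threshold
        self.last_t = 0.0           # time of the last packet arrival

    def on_packet(self, t: float, weight: float) -> bool:
        """Process an input packet arriving at time t with synaptic weight."""
        # Drain (leak) for the time elapsed since the last arrival.
        self.level = max(0.0, self.level - self.leak_rate * (t - self.last_t))
        self.last_t = t
        # Buffering: accumulate the packet's synaptic contribution.
        self.level += weight
        # Overflow: emit an output packet (a "spike") and reset.
        if self.level >= self.threshold:
            self.level = 0.0
            return True
        return False
```

In a π² system, the same state machine would run inside a switch's existing shaper and queue logic, with the output packet broadcast to downstream nodes; plain Python stands in for that hardware here.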
📝 Abstract
Routing, switching, and the interconnect fabric are essential for large-scale neuromorphic computing. Although this fabric plays only a supporting role in computation, for large AI workloads it ultimately determines energy consumption and speed. In this paper, we address this bottleneck by asking: (a) What computing paradigms are inherent in existing routing, switching, and interconnect systems, and how can they be used to implement a Processing-in-Interconnect (π²) computing paradigm? and (b) Leveraging current and future interconnect trends, how will a π² system's performance scale compared to other neuromorphic architectures? For (a), we show that the operations required by typical AI workloads can be mapped onto delays, causality, time-outs, packet drops, and broadcast operations, primitives already implemented in packet-switching and packet-routing hardware. We show that existing embedded buffering and traffic-shaping algorithms can be leveraged to implement neuron models and synaptic operations. Additionally, a knowledge-distillation framework can train and cross-map well-established neural network topologies onto π² without degrading generalization performance. For (b), analytical modeling shows that, unlike on other neuromorphic platforms, the energy scaling of π² improves with interconnect bandwidth and energy efficiency. We predict that, by leveraging trends in interconnect technology, a π² architecture can scale to brain-scale AI inference workloads more readily than existing platforms, at power consumption levels in the range of hundreds of watts.
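As a rough sketch of the cross-mapping step, the distillation objective could take the standard soft-target form below (Hinton et al.), with the student standing in for the π²-mapped network and the teacher for the conventional source network. This is an assumed, generic KD loss, not the paper's actual training code.

```python
# Generic knowledge-distillation loss (soft targets + hard labels), shown as
# an assumed sketch of the cross-mapping objective; the paper's actual loss
# and pi^2 student model are not specified here.
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Alpha-weighted mix of distillation (KL at temperature T) and CE loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),  # student log-probs
        F.softmax(teacher_logits / T, dim=-1),      # softened teacher targets
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term's magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

Freezing the teacher and minimizing this loss over the student's π²-compatible parameters is one plausible reading of cross-mapping "without degrading generalization performance."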