Pegasus: A Universal Framework for Scalable Deep Learning Inference on the Dataplane

📅 2025-06-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing intelligent data planes (IDPs) rely on match-action table (MAT) abstractions for deep learning (DL) inference, suffering from low accuracy, limited model capacity, and poor generality. This paper proposes a general-purpose DL inference framework for programmable data planes (e.g., P4-enabled switches), introducing the novel "Partition-Map-SumReduce" trio of primitives to overcome MAT's expressiveness bottleneck for AI workloads. By co-designing primitive fusion, full-precision weights with fixed-point activations, fuzzy matching, custom dataflow compilation, and quantization-aware optimization, the framework achieves high-accuracy, scalable, and model-agnostic hardware inference. It supports diverse architectures, including MLPs, RNNs, CNNs, and autoencoders, without structural modification. Compared to state-of-the-art IDP solutions, Pegasus improves average inference accuracy by 22.8%, increases the supported model parameter count by 248×, and scales input dimensionality by 212×.

📝 Abstract
The paradigm of Intelligent DataPlane (IDP) embeds deep learning (DL) models on the network dataplane to enable intelligent traffic analysis at line-speed. However, the current use of the match-action table (MAT) abstraction on the dataplane is misaligned with DL inference, leading to several key limitations, including accuracy degradation, limited scale, and lack of generality. This paper proposes Pegasus to address these limitations. Pegasus translates DL operations into three dataplane-oriented primitives to achieve generality: Partition, Map, and SumReduce. Specifically, Partition "divides" high-dimensional features into multiple low-dimensional vectors, making them more suitable for the dataplane; Map "conquers" computations on the low-dimensional vectors in parallel with the technique of fuzzy matching, while SumReduce "combines" the computation results. Additionally, Pegasus employs Primitive Fusion to merge computations, improving scalability. Finally, Pegasus adopts full-precision weights with fixed-point activations to improve accuracy. Our implementation on a P4 switch demonstrates that Pegasus can effectively support various types of DL models, including Multi-Layer Perceptron (MLP), Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and AutoEncoder models on the dataplane. Meanwhile, Pegasus outperforms state-of-the-art approaches with an average accuracy improvement of up to 22.8%, along with up to 248x larger model size and 212x larger input scale.
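The divide-conquer-combine flow described in the abstract can be illustrated with a small sketch. This is not the paper's implementation (Pegasus runs on a P4 switch and uses fuzzy matching); it is a minimal NumPy analogy in which a fully connected layer y = Wx is computed by Partitioning the input into low-dimensional chunks, Mapping each quantized chunk through a precomputed lookup table (standing in for a match-action table), and SumReducing the partial results. All function names and parameters here are illustrative assumptions.

```python
# Illustrative sketch only: emulates the Partition / Map / SumReduce idea
# from the abstract in NumPy. A real dataplane would use match-action
# tables and fuzzy matching instead of Python dicts.
import numpy as np

def partition(x, width):
    """Partition: split a high-dimensional input into low-dimensional chunks."""
    return [x[i:i + width] for i in range(0, len(x), width)]

def build_lookup(W_chunk, levels):
    """Map table: precompute quantized sub-vector -> partial product.
    On a switch this would be a match-action table; here it is a dict."""
    table = {}
    for combo in np.ndindex(*([levels] * W_chunk.shape[1])):
        table[combo] = W_chunk @ np.array(combo, dtype=float)
    return table

def infer(x, W, width=2, levels=4):
    """Partition the input, Map each chunk via table lookup, SumReduce."""
    chunks = partition(x, width)
    W_chunks = [W[:, i:i + width] for i in range(0, W.shape[1], width)]
    partials = []
    for xc, Wc in zip(chunks, W_chunks):
        # Fixed-point style key: round to a small set of levels.
        key = tuple(np.clip(np.round(xc), 0, levels - 1).astype(int))
        partials.append(build_lookup(Wc, levels)[key])   # Map
    return np.sum(partials, axis=0)                      # SumReduce
```

For inputs already on the quantization grid, the lookup path reproduces the direct matrix-vector product exactly, which is the point of the analogy: table lookups replace multiplications.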
Problem

Research questions and friction points this paper is trying to address.

Misalignment between DL inference and MAT abstraction on dataplane
Accuracy degradation and limited scale in current IDP approaches
Lack of generality in supporting diverse DL models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Translates DL operations into dataplane-oriented primitives
Uses Partition, Map, SumReduce for parallel computation
Employs Primitive Fusion and full precision weights
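The "full-precision weights with fixed-point activations" design in the last bullet can be sketched as follows. This is a hedged illustration, not the paper's code: weights stay in float, and only the activations passed between layers are rounded to a fixed-point grid, so quantization error is injected once per layer rather than into every weight. The bit-width and helper names are assumptions for illustration.

```python
# Hypothetical sketch of full-precision weights + fixed-point activations.
# Scale (frac_bits) and function names are illustrative, not from the paper.
import numpy as np

def to_fixed(x, frac_bits=8):
    """Quantize activations onto a fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return np.round(x * scale) / scale

def mlp_layer(x, W, b):
    """One layer: full-precision W and b, ReLU, then re-quantize the activations."""
    return to_fixed(np.maximum(W @ x + b, 0.0))
```

Keeping W in full precision avoids the accuracy loss of weight quantization, while fixed-point activations keep the inter-stage values representable in the narrow integer registers a dataplane offers.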
Authors

Yinchao Zhang — Tsinghua University (Intelligent Dataplane, Programmable Networks)
Su Yao — Tsinghua University (Blockchain, Collaborative Learning)
Yong Feng — Swinburne University of Technology, Australia (Sliding Mode Control, Electrical Engineering, Control and Observers)
Kang Chen — Beijing National Research Center for Information Science and Technology
Tong Li — Renmin University of China
Zhuotao Liu — Tsinghua University (Data/AI Privacy and Security, Datacenter Networking, Secure Internet, Blockchain/Web3.0 Infra)
Yi Zhao — Beijing Institute of Technology
Lexuan Zhang — Tsinghua University
Xiangyu Gao — Tsinghua University
Feng Xiong — Beihang University
Qi Li — Tsinghua University
Ke Xu — Tsinghua University