AI Summary
Existing data-plane decision trees are constrained by hardware resources, requiring precomputed, fixed-size feature sets, which severely limits model accuracy and scalability. This paper proposes a partitioned decision tree system for programmable data planes, enabling stateful streaming inference over sliding windows. Our approach addresses these limitations through three key contributions: (1) a subtree-based feature partitioning mechanism for distributed inference; (2) a loopback channel to enable cross-partition reuse of registers and match keys; and (3) a joint optimization framework that co-designs feature allocation and tree topology. Implemented in P4, the system integrates match-action tables, stateful registers, loopback control, and sliding-window feature extraction, supported by custom training and design-space exploration tools. Evaluation on real-world datasets shows it supports stateful feature sets five times larger than prior methods, achieves comparable detection latency, and incurs less than 0.05% loopback overhead under one million concurrent flows.
Abstract
Machine learning (ML) is increasingly being deployed in programmable data planes (switches and SmartNICs) to enable real-time traffic analysis, security monitoring, and in-network decision-making. Decision trees (DTs) are particularly well-suited for these tasks due to their interpretability and compatibility with data-plane architectures, i.e., match-action tables (MATs). However, existing in-network DT implementations are constrained by the need to compute all input features upfront, forcing models to rely on a small, fixed set of features per flow. This significantly limits model accuracy and scalability under stringent hardware resource constraints.
We present SPLIDT, a system that rethinks DT deployment in the data plane by enabling partitioned inference over sliding windows of packets. SPLIDT introduces two key innovations: (1) it assigns distinct, variable feature sets to individual sub-trees of a DT, grouped into partitions, and (2) it leverages an in-band control channel (via recirculation) to reuse data-plane resources (both stateful registers and match keys) across partitions at line rate. These insights allow SPLIDT to scale the number of stateful features a model can use without exceeding hardware limits. To support this architecture, SPLIDT incorporates a custom training and design-space exploration (DSE) framework that jointly optimizes feature allocation, tree partitioning, and DT model depth. Evaluation across multiple real-world datasets shows that SPLIDT achieves higher accuracy while supporting up to 5x more stateful features than prior approaches (e.g., NetBeacon and Leo). It maintains the same low time-to-detection (TTD) as these systems, while scaling to millions of flows with minimal recirculation overhead (<0.05%).
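To make the partitioning idea concrete, the following is a minimal, hypothetical Python sketch (not SPLIDT's actual implementation) of inference over a decision tree split into partitions: each partition owns its own small feature set, and a leaf may hand a flow off to another partition, emulating the recirculation/loopback channel. All names (`Node`, `infer`, the feature names, the recirculation budget) are illustrative assumptions.

```python
# Hypothetical sketch of partitioned decision-tree inference.
# Each partition owns a small feature set; a packet "recirculates"
# to the next partition until it reaches a leaf with a final label.

class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None,
                 label=None, next_partition=None):
        self.feature = feature                # feature name tested at this node
        self.threshold = threshold            # split threshold
        self.left = left                      # subtree taken if value <= threshold
        self.right = right                    # subtree taken if value > threshold
        self.label = label                    # final classification (leaf only)
        self.next_partition = next_partition  # hand-off to another partition (leaf only)

def infer(partitions, features, start=0, max_loops=8):
    """Walk partitioned subtrees, following cross-partition hand-offs."""
    node, loops = partitions[start], 0
    while loops <= max_loops:
        if node.label is not None:
            return node.label, loops          # final decision reached
        if node.next_partition is not None:   # emulate loopback/recirculation
            loops += 1
            node = partitions[node.next_partition]
            continue
        value = features[node.feature]
        node = node.left if value <= node.threshold else node.right
    raise RuntimeError("recirculation budget exceeded")

# Two-partition tree: partition 0 tests a packet-count feature, then hands
# busy flows to partition 1, which tests a stateful inter-arrival feature.
partitions = {
    0: Node("pkt_count", 10,
            left=Node(label="benign"),
            right=Node(next_partition=1)),
    1: Node("avg_iat_ms", 5.0,
            left=Node(label="attack"),
            right=Node(label="benign")),
}

label, recircs = infer(partitions, {"pkt_count": 42, "avg_iat_ms": 2.0})
```

In hardware, each partition would map to match-action tables and registers that are reused across recirculation passes rather than to Python objects; the sketch only illustrates the control flow of cross-partition hand-offs.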