AI Summary
To address the difficulty of extracting fine-grained features from encrypted traffic in real time on Tbps-scale links, and the high latency (>100 ms) inherent in traditional CPU-centric architectures, this paper proposes a data-plane-native feature extraction paradigm with direct GPU integration. Flow-level features are extracted at line rate on a P4-programmable switch (Intel Tofino) and then transferred to GPUs via RDMA and GPUDirect RDMA, so delivery is zero-copy and requires no CPU intervention. GPU-accelerated feature processing and AI inference follow immediately on arrival. This design achieves, for the first time, an end-to-end closed-loop analysis latency below 20 ms, with a per-port throughput of 31 million feature vectors per second and support for 524,000 concurrent flows. By bypassing control-plane bottlenecks, the system delivers scalable, ML-driven real-time network monitoring for ultra-high-speed networks.
Abstract
Real-time traffic monitoring is critical for network operators to ensure performance, security, and visibility, especially as encryption becomes the norm. AI and ML have emerged as powerful tools to create deeper insights from network traffic, but collecting the fine-grained features needed at terabit speeds remains a major bottleneck. We introduce Direct Feature Access (DFA): a high-speed telemetry system that extracts flow features at line rate using P4-programmable data planes, and delivers them directly to GPUs via RDMA and GPUDirect, completely bypassing the ML server's CPU. DFA enables feature enrichment and immediate inference on GPUs, eliminating traditional control plane bottlenecks and dramatically reducing latency. We implement DFA on Intel Tofino switches and NVIDIA A100 GPUs, achieving extraction and delivery of over 31 million feature vectors per second, supporting 524,000 flows within sub-20 ms monitoring periods, on a single port. DFA unlocks scalable, real-time, ML-driven traffic analysis at terabit speeds, pushing the frontier of what is possible for next-generation network monitoring.
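The headline figures can be cross-checked with back-of-the-envelope arithmetic. The sketch below assumes, which the abstract does not state, that each tracked flow contributes exactly one feature vector per monitoring period, so the shortest period in which every flow is reported once is the flow count divided by the delivery rate. The function name and the one-vector-per-flow model are illustrative assumptions, not details from the paper.

```python
def min_monitoring_period_ms(flows: int, vectors_per_second: float) -> float:
    """Shortest period (in ms) in which every flow can report one vector.

    Assumes one feature vector per flow per period (an illustrative model,
    not a detail taken from the paper).
    """
    if flows < 0 or vectors_per_second <= 0:
        raise ValueError("flows must be >= 0 and vectors_per_second > 0")
    return flows / vectors_per_second * 1000.0


# Figures quoted in the abstract (single port).
FLOWS = 524_000
VECTORS_PER_SECOND = 31_000_000

period_ms = min_monitoring_period_ms(FLOWS, VECTORS_PER_SECOND)
print(f"{period_ms:.1f} ms")  # prints 16.9 ms
```

Under this assumption, the reported delivery rate is consistent with the sub-20 ms monitoring period and leaves roughly 3 ms of headroom.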