CATO: End-to-End Optimization of ML-Based Traffic Analysis Pipelines

📅 2024-02-08
🏛️ Symposium on Networked Systems Design and Implementation (NSDI)
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the trade-off between prediction accuracy and system overhead (e.g., inference latency, throughput) in ML-based network traffic analysis under real-world deployment constraints. The paper proposes CATO, an end-to-end, multi-objective Bayesian optimization framework for serving pipelines that jointly tunes feature selection, ML model choice, and system-level serving parameters to automatically generate Pareto-optimal, deployable configurations. Compared to popular feature optimization techniques, CATO delivers up to 3,600× lower inference latency and 3.7× higher zero-loss throughput while simultaneously achieving better model performance, substantially improving the practicality of deploying traffic analysis models in production networks.

📝 Abstract
Machine learning has shown tremendous potential for improving the capabilities of network traffic analysis applications, often outperforming simpler rule-based heuristics. However, ML-based solutions remain difficult to deploy in practice. Many existing approaches only optimize the predictive performance of their models, overlooking the practical challenges of running them against network traffic in real time. This is especially problematic in the domain of traffic analysis, where the efficiency of the serving pipeline is a critical factor in determining the usability of a model. In this work, we introduce CATO, a framework that addresses this problem by jointly optimizing the predictive performance and the associated systems costs of the serving pipeline. CATO leverages recent advances in multi-objective Bayesian optimization to efficiently identify Pareto-optimal configurations, and automatically compiles end-to-end optimized serving pipelines that can be deployed in real networks. Our evaluations show that compared to popular feature optimization techniques, CATO can provide up to 3600x lower inference latency and 3.7x higher zero-loss throughput while simultaneously achieving better model performance.
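The "Pareto-optimal configurations" in the abstract can be illustrated with a small sketch. Under the (hypothetical) simplification that each candidate pipeline configuration reduces to a pair of objectives to minimize, model error rate and inference latency, a configuration is Pareto-optimal if no other configuration is at least as good on both objectives and strictly better on one. This is only an illustration of the selection criterion, not CATO's actual search, which uses multi-objective Bayesian optimization rather than exhaustive filtering.

```python
# Sketch of Pareto-optimal filtering over (error, latency) pairs;
# lower is better on both objectives. The config values are made up.

def dominates(a, b):
    """a dominates b if a is no worse on every objective and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# (model error rate, inference latency in ms) for candidate pipeline configs
configs = [(0.02, 50.0), (0.05, 5.0), (0.02, 80.0), (0.10, 1.0), (0.04, 10.0)]
print(pareto_front(configs))
# → [(0.02, 50.0), (0.05, 5.0), (0.1, 1.0), (0.04, 10.0)]
```

Note that (0.02, 80.0) is dropped: (0.02, 50.0) matches its accuracy at lower latency, so a deployment would never prefer it.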
Problem

Research questions and friction points this paper is trying to address.

ML-based traffic analysis models are difficult to deploy for real-time use
Serving pipelines must balance predictive performance against system efficiency
High inference latency and limited throughput undermine otherwise accurate models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Jointly optimizes predictive performance and serving-pipeline systems costs
Uses multi-objective Bayesian optimization to efficiently identify Pareto-optimal configurations
Automatically compiles end-to-end optimized serving pipelines deployable in real networks
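The joint search space behind these contributions can be sketched as follows. Each candidate couples a feature subset and model choice with a system-level knob (batch size), and every candidate is scored on two objectives at once. For simplicity this sketch uses random sampling in place of the Bayesian surrogate, and the `measure()` costs are invented stand-ins, not CATO's profiler; it only shows how model-level and system-level choices land on a single Pareto front.

```python
# Hypothetical joint search over (feature subset, model, batch size),
# scored on two objectives: model error rate and per-packet latency.
# Random sampling stands in for multi-objective Bayesian optimization.
import random

FEATURES = ["pkt_len", "iat", "tcp_flags", "payload_bytes"]
MODELS = {"dt": 0.08, "rf": 0.04, "gbm": 0.03}  # invented base error rates

def sample_config(rng):
    feats = tuple(f for f in FEATURES if rng.random() < 0.5)
    return {
        "features": feats or (FEATURES[0],),  # keep at least one feature
        "model": rng.choice(list(MODELS)),
        "batch_size": rng.choice([1, 32, 256]),
    }

def measure(cfg):
    """Stand-in for profiling a compiled pipeline: returns (error, latency_ms)."""
    error = MODELS[cfg["model"]] + 0.01 * (len(FEATURES) - len(cfg["features"]))
    latency = 0.5 * len(cfg["features"]) * {"dt": 1, "rf": 3, "gbm": 5}[cfg["model"]]
    latency += 10.0 / cfg["batch_size"]  # batching amortizes fixed per-call cost
    return error, latency

def optimize(iters=200, seed=0):
    rng = random.Random(seed)
    scored = [(cfg, measure(cfg))
              for cfg in (sample_config(rng) for _ in range(iters))]
    def dominated(p):
        # weakly dominated by a distinct measured point (lower is better)
        return any(q[0] <= p[0] and q[1] <= p[1] and q != p for _, q in scored)
    return [(cfg, p) for cfg, p in scored if not dominated(p)]

for cfg, (err, lat) in optimize():
    print(cfg["model"], cfg["batch_size"], round(err, 3), round(lat, 2))
```

The surviving configurations trade accuracy for latency; a real deployment would pick one point from the front based on its throughput and accuracy budget.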