🤖 AI Summary
Problem: Traditional sketches suffer from low accuracy in dynamic network flow mining, while ML-based optimization methods exhibit poor adaptability and high training overhead. Method: This paper proposes a novel two-tier flow sketch architecture integrating a large language model (LLM). It pioneers the use of lightweight fine-tuned LLMs in sketch design, leveraging non-ID packet-header fields (e.g., TTL, IP identification) for joint flow feature modeling. A decoupled two-tier structure is introduced: an upper tier classifies flows into heavy vs. light categories, while a lower tier performs regression-based flow size estimation, balancing memory efficiency and dynamic adaptability. Results: On three representative flow analysis tasks, the method achieves 7.5× higher accuracy than state-of-the-art approaches, reduces training cost by an order of magnitude, and significantly improves robustness to abrupt traffic distribution shifts.
📝 Abstract
Network stream mining is fundamental to many network operations. Sketches, as compact data structures that offer low memory overhead with bounded accuracy, have emerged as a promising solution for network stream mining. Recent studies attempt to optimize sketches using machine learning; however, these approaches face the challenges of lacking adaptivity to dynamic networks and incurring high training costs. In this paper, we propose LLM-Sketch, based on the insight that fields beyond the flow IDs in packet headers can also help infer flow sizes. By using a two-tier data structure and separately recording large and small flows, LLM-Sketch improves accuracy while minimizing memory usage. Furthermore, it leverages fine-tuned large language models (LLMs) to reliably estimate flow sizes. We evaluate LLM-Sketch on three representative tasks, and the results demonstrate that LLM-Sketch outperforms state-of-the-art methods by achieving a $7.5\times$ accuracy improvement.
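To make the two-tier idea concrete, the snippet below is a minimal, generic sketch of a heavy/light split — it is not the paper's design (LLM-Sketch's classifier is a fine-tuned LLM over packet-header fields, and its lower tier is regression-based). Here the class names, the count-based promotion rule, and all parameters (`heavy_capacity`, `width`, `depth`, `promote_at`) are illustrative assumptions: heavy flows get exact per-flow counters, while light flows share a count-min sketch, trading a bounded overestimate for memory.

```python
import hashlib


class TwoTierSketch:
    """Illustrative two-tier flow sketch (not LLM-Sketch itself):
    an upper tier holds exact counters for flows classified as heavy;
    a lower tier (count-min sketch) estimates the many light flows."""

    def __init__(self, heavy_capacity=64, width=1024, depth=3, promote_at=128):
        self.heavy = {}                       # flow_id -> exact count
        self.heavy_capacity = heavy_capacity
        self.promote_at = promote_at          # stand-in for the paper's learned classifier
        self.depth = depth
        self.width = width
        self.cm = [[0] * width for _ in range(depth)]

    def _cells(self, flow_id):
        # One independent hash position per count-min row.
        for i in range(self.depth):
            h = hashlib.sha256(f"{i}:{flow_id}".encode()).digest()
            yield i, int.from_bytes(h[:8], "big") % self.width

    def add(self, flow_id, n=1):
        if flow_id in self.heavy:
            self.heavy[flow_id] += n
            return
        for i, j in self._cells(flow_id):
            self.cm[i][j] += n
        est = self._cm_query(flow_id)
        # Promote once the estimate crosses the threshold and space remains.
        if est >= self.promote_at and len(self.heavy) < self.heavy_capacity:
            self.heavy[flow_id] = est

    def _cm_query(self, flow_id):
        # Minimum over rows: never underestimates the true count.
        return min(self.cm[i][j] for i, j in self._cells(flow_id))

    def query(self, flow_id):
        if flow_id in self.heavy:
            return self.heavy[flow_id]
        return self._cm_query(flow_id)
```

Separating the tiers keeps the memory budget small: only the few heavy flows pay for exact counters, while the light-flow tier amortizes its cells across all remaining flows.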