TraceCodec: A Compiler-Backed Neural Codec for Stateful Multi-Flow Network Traffic Traces

📅 2026-05-28
🤖 AI Summary
Existing network traffic generators struggle to model multi-flow interactions and TCP state machines because they decode directly to raw packet fields, which entangles learned behavioral choices with deterministic protocol consequences and forces reliance on heuristic post-hoc repair. This work proposes TraceCodec, a state-aware neural codec paired with a deterministic protocol compiler. TraceCodec moves the generation space from raw packet headers to timed packet actions, each carrying a timestamp, an explicit flow slot, and transport cues, and learns a continuous per-packet latent over them. The compiler lowers decoded actions back to PCAPs and owns endpoint assignment, TCP state, legality constraints, and packet rendering, so learned behavior is decoupled from protocol implementation and traces are synthesized without post-generation repair. On CICIDS2017 Monday, TraceCodec matches packet count, protocol composition, and flow population to within 0.03%, whereas raw-field baselines under the same non-repair policy distort flow counts and TCP state by orders of magnitude. Structural diagnostics show that it also preserves TCP state transitions and multi-flow interleaving.
📝 Abstract
Critical networking workflows require high-fidelity packet captures (PCAPs) for testing, security analysis, and protocol validation, not just statistical flow-level summaries. Recent packet generators have demonstrated protocol-constrained PCAP synthesis, but they universally decode directly to raw packet fields. That interface entangles learned behavioral choices with deterministic protocol consequences, which forces packet realization to depend on post-hoc heuristic repair. We identify this decode interface as the fundamental bottleneck and present TraceCodec, a state-aware neural codec for stateful multi-flow traces. TraceCodec lifts each packet into a timed packet action with explicit flow slots and transport cues, then learns a continuous per-packet latent. A deterministic compiler lowers decoded actions back to PCAPs, owning endpoint assignment, TCP state, legality constraints, and packet rendering. The latent layer exposes a generator-facing sequence space, so downstream traffic models can operate on packet-action latents rather than raw header fields. On CICIDS2017 Monday, TraceCodec matches packet count, protocol composition, and flow population to within 0.03%. Raw-field baselines under the same non-repair policy distort flow counts and TCP state by orders of magnitude. Structural diagnostics show that TraceCodec preserves TCP state transitions and multi-flow interleaving that raw-field decoders fragment. This work establishes a new foundation for high-fidelity packet-trace generation.
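To make the split between packet actions and the compiler concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the names `PacketAction`, `FlowState`, `ToyCompiler`, the cue vocabulary (`"open"`, `"data"`), the fixed addresses, and the initial sequence numbers are all hypothetical, and the sketch emits in-memory packet records rather than a real PCAP. The action carries only a timestamp, a flow slot, a direction, and a transport cue; the compiler alone decides endpoints, flags, sequence and acknowledgment numbers, and whether the action is legal in the flow's current state.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]


@dataclass(frozen=True)
class PacketAction:
    """Hypothetical packet action: what a decoder would emit per packet."""
    t: float              # timestamp in seconds
    slot: int             # explicit flow slot
    direction: int        # 0 = client to server, 1 = server to client
    cue: str              # assumed cue vocabulary: "open" or "data"
    payload_len: int = 0


@dataclass
class Packet:
    """Rendered packet record (a stand-in for a real PCAP entry)."""
    t: float
    src: Endpoint
    dst: Endpoint
    flags: str
    seq: int
    ack: int
    payload_len: int


@dataclass
class FlowState:
    endpoints: Tuple[Endpoint, Endpoint]  # (client, server)
    next_seq: List[int]                   # next sequence number per direction
    phase: str = "closed"


class ToyCompiler:
    """Deterministic lowering: owns endpoints, TCP state, and legality."""

    def __init__(self) -> None:
        self.flows: Dict[int, FlowState] = {}

    def _flow(self, slot: int) -> FlowState:
        # Endpoint assignment is the compiler's job, not the model's.
        if slot not in self.flows:
            client = ("10.0.0.1", 40000 + slot)  # assumed addressing scheme
            server = ("10.0.0.2", 80)
            self.flows[slot] = FlowState((client, server), [1000, 5000])
        return self.flows[slot]

    def lower(self, a: PacketAction) -> Packet:
        f = self._flow(a.slot)
        d, peer = a.direction, 1 - a.direction
        if a.cue == "open" and f.phase == "closed" and d == 0:
            flags, payload, consumed = "S", 0, 1   # SYN consumes one seq number
            f.phase = "syn_sent"
        elif a.cue == "open" and f.phase == "syn_sent" and d == 1:
            flags, payload, consumed = "SA", 0, 1  # SYN-ACK consumes one as well
            f.phase = "syn_rcvd"
        elif a.cue == "data" and (
            f.phase == "established" or (f.phase == "syn_rcvd" and d == 0)
        ):
            f.phase = "established"                # client ACK completes handshake
            payload = a.payload_len
            flags = "PA" if payload else "A"
            consumed = payload
        else:
            # No silent repair: an illegal action is rejected outright.
            raise ValueError(f"illegal action {a.cue!r} in phase {f.phase!r}")
        seq = f.next_seq[d]
        ack = f.next_seq[peer] if "A" in flags else 0
        f.next_seq[d] += consumed
        return Packet(a.t, f.endpoints[d], f.endpoints[peer], flags, seq, ack, payload)


# Two interleaved flows: slot 0 completes a handshake and sends data,
# slot 1 only sends its SYN.
actions = [
    PacketAction(0.00, 0, 0, "open"),
    PacketAction(0.01, 1, 0, "open"),
    PacketAction(0.02, 0, 1, "open"),
    PacketAction(0.03, 0, 0, "data"),
    PacketAction(0.04, 0, 0, "data", 100),
    PacketAction(0.05, 0, 1, "data"),
]
compiler = ToyCompiler()
packets = [compiler.lower(a) for a in actions]
```

Note that no action specifies a sequence number, flag, or port. Those follow deterministically from the per-slot state, which is the property the abstract attributes to the compiler. A fuller version would also need teardown cues, UDP and other protocols, and real packet serialization.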
Problem

Research questions and friction points this paper is trying to address.

packet trace generation
stateful network traffic
protocol-constrained synthesis
TCP state preservation
multi-flow interleaving
Innovation

Methods, ideas, or system contributions that make the work stand out.

neural codec
stateful traffic generation
packet-action abstraction
compiler-backed synthesis
high-fidelity PCAP