🤖 AI Summary
Existing PCIe TLP trace generation methods ignore protocol-level constraints, such as TLP ordering and causality, and therefore produce noncompliant, practically unusable traces. This work is the first to frame TLP trace generation as a generative AI task under hardware-imposed constraints. It introduces Phantom, a protocol-aware framework that combines PCIe transaction-layer semantics (e.g., TLP ordering and causal consistency), structured prompt engineering, and a protocol-aware loss function to enable constrained sequence modeling and decoding. Evaluated on an actual PCIe network interface card, Phantom generates large-scale, protocol-compliant TLP traces, with improvements of up to 1000× in task-specific metrics and up to 2.19× in Fréchet Inception Distance (FID) over backbone-only baselines. Phantom thus pairs protocol compliance with statistical fidelity, offering a practical generative approach to PCIe peripheral prototyping and optimization.
📝 Abstract
Peripheral Component Interconnect Express (PCIe) is the de facto interconnect standard for high-speed peripherals and CPUs. Prototyping and optimizing PCIe devices for emerging scenarios is an ongoing challenge. Since Transaction Layer Packets (TLPs) capture device-CPU interactions, it is crucial to analyze and generate realistic TLP traces for effective device design and optimization. Generative AI offers a promising approach for creating intricate, custom TLP traces necessary for PCIe hardware and software development. However, existing models often generate impractical traces due to the absence of PCIe-specific constraints, such as TLP ordering and causality. This paper presents Phantom, the first framework that treats TLP trace generation as a generative AI problem while incorporating PCIe-specific constraints. We validate Phantom's effectiveness by generating TLP traces for an actual PCIe network interface card. Experimental results show that Phantom produces practical, large-scale TLP traces, significantly outperforming existing models, with improvements of up to 1000$\times$ in task-specific metrics and up to 2.19$\times$ in Fréchet Inception Distance (FID) compared to backbone-only methods.
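
To make the causality constraint concrete, the following is a minimal illustrative sketch of a check that a generated trace could be required to pass. It is not Phantom's implementation. The `Tlp` record, the `causality_violations` function, and the three packet kinds are hypothetical names chosen for illustration, and the model is deliberately simplified: it assumes each memory read receives exactly one completion, whereas real PCIe allows a read to be answered by several split completions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tlp:
    """Hypothetical, heavily simplified TLP record (illustration only)."""
    kind: str          # "MRd" (non-posted read), "MWr" (posted write), "CplD" (completion with data)
    requester_id: int  # ID of the device that issued the original request
    tag: int           # tag that pairs a completion with its request


def causality_violations(trace):
    """Return indices of TLPs that break request/completion causality.

    Assumption: one completion per read request (no split completions).
    """
    outstanding = set()  # (requester_id, tag) pairs still awaiting a completion
    violations = []
    for i, tlp in enumerate(trace):
        key = (tlp.requester_id, tlp.tag)
        if tlp.kind == "MRd":
            if key in outstanding:
                violations.append(i)  # tag reused while the earlier read is in flight
            outstanding.add(key)
        elif tlp.kind == "CplD":
            if key in outstanding:
                outstanding.remove(key)
            else:
                violations.append(i)  # completion with no matching earlier request
        # Posted writes ("MWr") expect no completion, so they are not tracked here.
    return violations


compliant = [Tlp("MRd", 1, 0), Tlp("MWr", 1, 0), Tlp("CplD", 1, 0)]
broken = [Tlp("CplD", 1, 0), Tlp("MRd", 1, 0), Tlp("MRd", 1, 0)]
print(causality_violations(compliant))
print(causality_violations(broken))
```

A trace produced by an unconstrained sequence model can look statistically plausible and still fail a check of this kind, for example by emitting a completion before the read it answers, which is the sort of impractical output the abstract attributes to existing models.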