ODIN-Based CPU-GPU Architecture with Replay-Driven Simulation and Emulation

📅 2026-03-17

🤖 AI Summary

This work proposes a replay-driven verification methodology for pre-silicon validation of chiplet-based CPU-GPU subsystems, where large design scale, high concurrency, non-deterministic execution, and intricate protocol interactions at chiplet boundaries often lead to long integration cycles. The approach captures waveforms deterministically and replays them across both simulation and hardware emulation from a single design database, so that complex GPU workloads and protocol sequences can be reproduced reliably at the system level. Applied to an SoC building block that combines a CPU subsystem, multiple Xe GPU cores, and a configurable NoC and that targets the ODIN chiplet architecture, the methodology enabled end-to-end system boot and workload execution within a single quarter, which the authors present as evidence of its scalability.

📝 Abstract

Integration of CPU and GPU technologies is a key enabler for modern AI and graphics workloads, combining control-oriented processing with massively parallel compute capability. As systems evolve toward chiplet-based architectures, pre-silicon validation of tightly coupled CPU-GPU subsystems becomes increasingly challenging due to complex validation framework setup, large design scale, high concurrency, non-deterministic execution, and intricate protocol interactions at chiplet boundaries, often resulting in long integration cycles. This paper presents a replay-driven validation methodology developed during the integration of a CPU subsystem, multiple Xe GPU cores, and a configurable Network-on-Chip (NoC) within a foundational SoC building block targeting the ODIN integrated chiplet architecture. By leveraging deterministic waveform capture and replay across both simulation and emulation using a single design database, complex GPU workloads and protocol sequences can be reproduced reliably at the system level. This approach significantly accelerates debug, improves integration confidence, and enables end-to-end system boot and workload execution within a single quarter, demonstrating the effectiveness of replay-based validation as a scalable methodology for chiplet-based systems.
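
The capture-and-replay idea in the abstract can be illustrated with a toy sketch. This is not the authors' tooling: the `Event` record, the channel names `noc_req` and `noc_rsp`, and the `live_run` and `replay` functions are all hypothetical stand-ins, and a real flow would record signal-level waveforms at a chiplet boundary rather than Python objects. The sketch only shows why replaying one captured trace gives the same result on every run, even though the original run was non-deterministic.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    cycle: int    # cycle at which the boundary transaction was observed
    channel: str  # hypothetical interface name, e.g. "noc_req"
    payload: int  # data value observed on the interface


def live_run(num_events, rng):
    """Stand-in for a non-deterministic run: timing and data differ per run."""
    cycle = 0
    trace = []
    for _ in range(num_events):
        cycle += rng.randint(1, 5)  # variable arrival gap
        channel = rng.choice(["noc_req", "noc_rsp"])
        trace.append(Event(cycle, channel, rng.randrange(256)))
    return trace


def replay(trace):
    """Drive a toy design model from a recorded trace and return a signature."""
    state = 0
    for ev in sorted(trace, key=lambda e: e.cycle):
        weight = 3 if ev.channel == "noc_req" else 5
        state = (state * 31 + ev.cycle * weight + ev.payload) % (2 ** 32)
    return state


# Capture once from an unseeded (non-deterministic) run ...
captured = live_run(100, random.Random())

# ... then replay the same trace twice, standing in for two platforms
# (for example simulation and emulation) fed from one recording.
sig_first = replay(captured)
sig_second = replay(captured)
assert sig_first == sig_second
```

The design point is that all non-determinism is confined to the capture step. Once the trace is fixed, any platform that consumes it sees identical stimulus, so a failure observed on one platform can be reproduced on another.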

Problem

Research questions and friction points this paper is trying to address.

pre-silicon validation

CPU-GPU integration

chiplet architecture

non-deterministic execution

protocol interaction

Innovation

Methods, ideas, or system contributions that make the work stand out.

replay-driven validation

CPU-GPU integration

chiplet architecture

deterministic replay

pre-silicon validation
