ODIN-Based CPU-GPU Architecture with Replay-Driven Simulation and Emulation

📅 2026-03-17
🤖 AI Summary
This work proposes a replay-driven verification methodology to address the challenges of pre-silicon validation in chiplet-based CPU-GPU subsystems, including high complexity, non-deterministic execution, intricate protocol interactions, and prolonged integration cycles. The approach introduces, for the first time, a deterministic replay mechanism into CPU-GPU-NoC co-verification, leveraging a unified design database to enable deterministic waveform capture and replay across simulation and hardware emulation. This facilitates reliable replay of complex GPU workloads. Built upon the ODIN chiplet architecture, a configurable NoC, and multiple Xe GPU cores, the unified verification platform successfully achieved end-to-end system boot and workload execution within a single quarter, significantly accelerating the integration timeline and demonstrating the method's scalability and efficiency.

📝 Abstract
Integration of CPU and GPU technologies is a key enabler for modern AI and graphics workloads, combining control-oriented processing with massive parallel compute capability. As systems evolve toward chiplet-based architectures, pre-silicon validation of tightly coupled CPU-GPU subsystems becomes increasingly challenging due to complex validation framework setup, large design scale, high concurrency, non-deterministic execution, and intricate protocol interactions at chiplet boundaries, often resulting in long integration cycles. This paper presents a replay-driven validation methodology developed during the integration of a CPU subsystem, multiple Xe GPU cores, and a configurable Network-on-Chip (NoC) within a foundational SoC building block targeting the ODIN integrated chiplet architecture. By leveraging deterministic waveform capture and replay across both simulation and emulation using a single design database, complex GPU workloads and protocol sequences can be reproduced reliably at the system level. This approach significantly accelerates debug, improves integration confidence, and enables end-to-end system boot and workload execution within a single quarter, demonstrating the effectiveness of replay-based validation as a scalable methodology for chiplet-based systems.
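The capture-and-replay idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's tooling: `capture`, `replay`, `ToyBlock`, and the `req_data` signal name are all hypothetical, and a running checksum stands in for a real design under test. It only shows why recording boundary stimulus once and re-driving it makes a non-deterministic scenario reproducible across repeated runs.

```python
import random


def capture(cycles, rng):
    """Record stimulus seen at a block boundary during a non-deterministic run.

    Each event is a (cycle, signal, value) tuple. The signal name is hypothetical.
    """
    trace = []
    for cycle in range(cycles):
        # Random arrival mimics timing jitter in the full system.
        if rng.random() < 0.5:
            trace.append((cycle, "req_data", rng.randrange(256)))
    return trace


class ToyBlock:
    """Stand-in for the design under test: folds inputs into a running checksum."""

    def __init__(self):
        self.state = 0

    def step(self, value):
        self.state = (self.state * 31 + value) % 65521
        return self.state


def replay(trace, cycles):
    """Drive a fresh block with the recorded events at their recorded cycles."""
    block = ToyBlock()
    by_cycle = {cycle: value for cycle, _signal, value in trace}
    outputs = []
    for cycle in range(cycles):
        if cycle in by_cycle:
            outputs.append((cycle, block.step(by_cycle[cycle])))
    return outputs


# Unseeded generator: the captured trace differs from run to run.
trace = capture(cycles=100, rng=random.Random())
first = replay(trace, cycles=100)   # e.g. a simulation pass
second = replay(trace, cycles=100)  # e.g. an emulation pass
assert first == second              # same trace in, same behavior out
```

In this sketch the only source of randomness is the capture step, so any failure seen once can be reproduced as often as needed from the stored trace, which is the property the abstract attributes to replay-based debug.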
Problem

Research questions and friction points this paper is trying to address.

pre-silicon validation
CPU-GPU integration
chiplet architecture
non-deterministic execution
protocol interaction
Innovation

Methods, ideas, or system contributions that make the work stand out.

replay-driven validation
CPU-GPU integration
chiplet architecture
deterministic replay
pre-silicon validation
Authors

Nij Dorairaj, Intel Corporation
Debabrata Chatterjee, Intel Corporation
Hong Wang, Intel Corporation (Intel Fellow; Computer Architecture; Microarchitecture; CPU/xPU IP/SoC Silicon Development)
Hong Jiang, University of Texas at Arlington (computer science; computer architecture; file and storage systems; cloud computing; parallel and distributed processing)
Alankar Saxena, Intel Corporation
Altug Koker, Intel Corporation
Thiam Ern Lim, Intel Corporation
Cathrane Teoh, Intel Corporation
Chuan Yin Loo, Intel Corporation
Bishara Shomar, Intel Corporation/Nvidia Corp
Anthony Lester, Synopsys Inc