🤖 AI Summary
On-chip interconnects in System-on-Chip (SoC) designs are susceptible to soft errors in radiation environments, and such an error is likely to cause a functional failure of the whole SoC. This paper proposes relOBI, a reliability extension to the Open Bus Interface (OBI). It uses a hybrid fault-tolerance strategy: triple modular redundancy (TMR) for critical handshake signals and error-correcting code (ECC) protection for all other signals. The authors implement and test a fully reliable crossbar built on this scheme. In fault-injection experiments, vulnerability to injected faults drops from 34.85% for a reference design to 0%. The cost is a 2.6× area increase and a 1.4× timing impact; the area overhead is 1.8× lower than that reported in the literature for fine-grained triplication and voting.
📝 Abstract
On-chip communication is a critical element of modern systems-on-chip (SoCs), allowing processor cores to interact with memory and peripherals. Interconnects require special care in radiation-heavy environments, as any soft error within the SoC interconnect is likely to cause a functional failure of the whole SoC. This work proposes relOBI, an extension to the Open Bus Interface (OBI) that combines triple modular redundancy (TMR) for critical handshake signals with error-correcting code (ECC) protection on all other signals for complete reliability. In fault-injection tests, a fully reliable crossbar implementation reduces the vulnerability to injected faults from 34.85 % for a reference design to 0 %, at the cost of a 2.6x area increase and a 1.4x timing impact. The area overhead is 1.8x lower than that reported in the literature for fine-grained triplication and voting.
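The two protection mechanisms named in the abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the relOBI hardware: the abstract does not say which ECC the design uses, so a Hamming(7,4) single-error-correcting code stands in for it here, and the function names `tmr_vote`, `hamming74_encode`, and `hamming74_decode` are hypothetical.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies (TMR voter).

    A single corrupted copy is outvoted by the other two.
    """
    return (a & b) | (a & c) | (b & c)


def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Assumption: Hamming(7,4) is only an illustrative stand-in for the
    ECC used on non-handshake signals.
    """
    d = [(nibble >> i) & 1 for i in range(4)]  # d0..d3
    p1 = d[0] ^ d[1] ^ d[3]  # covers codeword positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers codeword positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers codeword positions 4, 5, 6, 7
    # Codeword positions 1..7 hold: p1 p2 d0 p4 d1 d2 d3
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(bit << i for i, bit in enumerate(bits))


def hamming74_decode(word: int) -> int:
    """Correct up to one flipped bit and return the 4 data bits."""
    bits = [(word >> i) & 1 for i in range(7)]
    s1 = bits[0] ^ bits[2] ^ bits[4] ^ bits[6]
    s2 = bits[1] ^ bits[2] ^ bits[5] ^ bits[6]
    s4 = bits[3] ^ bits[4] ^ bits[5] ^ bits[6]
    syndrome = s1 | (s2 << 1) | (s4 << 2)  # 1-based position of the error
    if syndrome:
        bits[syndrome - 1] ^= 1  # flip the faulty bit back
    data = [bits[2], bits[4], bits[5], bits[6]]
    return sum(bit << i for i, bit in enumerate(data))


if __name__ == "__main__":
    # A handshake bit survives one corrupted copy.
    print(tmr_vote(1, 1, 0))
    # A data nibble survives one flipped bit in its codeword.
    codeword = hamming74_encode(0b1011)
    print(bin(hamming74_decode(codeword ^ (1 << 4))))
```

The split reflects the trade-off the abstract describes: triplication with voting masks a fault without a decode step, which suits the few handshake signals, while a code adds only a few check bits per word and so costs less area on the wider signals than triplicating every wire.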