AI Summary
To address the heavy template dependence of traditional RTL auto-repair and the unreliable outputs of large language models, this paper proposes R3A, a framework that combines multi-agent fault localization with stochastic Tree-of-Thoughts patch generation. Methodologically, it brings RTL semantic understanding and waveform analysis into a multi-agent system to build a template-free localization mechanism, and it uses stochastic tree-structured reasoning that samples search states with a heuristic function to balance exploration and exploitation. Key contributions include: (i) the integration of a multi-agent system with stochastic Tree-of-Thoughts reasoning for RTL repair, removing the reliance on predefined templates; and (ii) waveform-aware multi-agent fault localization that supplies fault candidates as starting points for patch generation. Evaluated on the RTL-repair dataset, R3A fixes 90.6% of bugs within the given time limit, covering 45% more bugs than traditional methods and other LLM-based approaches, and achieves an average pass@5 rate of 86.7%.
Abstract
Repairing RTL bugs is crucial for hardware design and verification. Traditional automatic program repair (APR) methods define dedicated search spaces and use program synthesis to locate and fix bugs. However, they rely heavily on fixed templates and can handle only a limited range of bugs. As an alternative, Large Language Models (LLMs), which can understand code semantics, can be explored for RTL repair. However, their outcomes are unreliable because of inherent randomness and the long input contexts of RTL code and waveforms. To address these challenges, we propose R3A, an LLM-based automatic RTL program repair framework that builds on a base model to improve its reliability. R3A uses a stochastic Tree-of-Thoughts method to steer a patch generation agent toward a validated fix for the bug. The algorithm samples search states according to a heuristic function, balancing exploration and exploitation to reach a reliable outcome. In addition, R3A uses a multi-agent fault localization method to find fault candidates that serve as starting points for the patch generation agent, further increasing reliability. Experiments show that R3A fixes 90.6% of the bugs in the RTL-repair dataset within a given time limit, covering 45% more bugs than traditional methods and other LLM-based approaches, while achieving an average pass@5 rate of 86.7%, which indicates high reliability.
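The abstract does not specify the heuristic function or the sampling distribution, so the following is only a minimal sketch of heuristic-weighted state sampling, not R3A's actual algorithm. It assumes a softmax over heuristic scores with a temperature parameter. The names `sample_state`, `stochastic_search`, `expand`, `score`, and `is_fixed` are hypothetical: in a real setup `expand` would ask an LLM agent for candidate patches and `is_fixed` would run the testbench.

```python
import math
import random


def sample_state(states, scores, temperature=1.0, rng=random):
    """Sample one state with probability proportional to exp(score / temperature).

    Assumption: softmax weighting stands in for the paper's unspecified heuristic.
    """
    if not states or len(states) != len(scores):
        raise ValueError("states and scores must be non-empty and the same length")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(scores)  # subtract the max so exp() cannot overflow
    weights = [math.exp((s - top) / temperature) for s in scores]
    return rng.choices(states, weights=weights, k=1)[0]


def stochastic_search(root, expand, score, is_fixed,
                      budget=50, temperature=0.5, seed=0):
    """Sample a frontier state, expand it, and stop at the first validated child.

    expand(state)   -> iterable of child states (hypothetical patch proposals)
    score(state)    -> float, higher means more promising (hypothetical heuristic)
    is_fixed(state) -> True if the candidate passes validation (hypothetical)
    """
    rng = random.Random(seed)
    frontier = [root]
    for _ in range(budget):
        scores = [score(s) for s in frontier]
        state = sample_state(frontier, scores, temperature, rng)
        for child in expand(state):
            if is_fixed(child):
                return child
            # Expanded states stay in the frontier so they can be revisited.
            frontier.append(child)
    return None  # budget exhausted without a validated fix
```

In this sketch a low temperature concentrates sampling on the highest-scoring states (exploitation), while a high temperature spreads it across the frontier (exploration). Keeping expanded states in the frontier lets the search return to an earlier state when a promising branch stalls.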