🤖 AI Summary
In industrial-scale Verilog module debugging, large-context LLMs suffer from diluted fault signals and, as a result, low localization and repair accuracy. To address this, we propose a two-stage automated repair method driven by semantic slicing. Our core contributions are threefold: (1) the first semantics-guided module slicing mechanism, which decomposes Verilog code into functionally cohesive fragments; (2) a fragment-level synthetic data framework that trains LLMs to identify and repair defects within localized contexts; and (3) a non-intrusive edit-merging strategy that leaves unrelated logic unchanged. On a hardware verification benchmark, our method achieves 77.92% pass@1 and 83.88% pass@5, outperforming state-of-the-art baselines including Claude-3.7, Strider, and MEIC. These results support the effectiveness of semantic slicing for hardware debugging tasks.
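For readers unfamiliar with the pass@1 and pass@5 figures, the sketch below shows one common way such numbers are computed. The text here does not say which estimator was used, so this is an assumption: it is the unbiased pass@k estimator widely used in code-generation evaluation, where `n` candidate repairs are sampled per bug and `c` of them pass verification. The function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: number of repair candidates sampled for one bug
    c: number of those candidates that pass verification
    k: budget of attempts being scored (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k candidates fail, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the fraction of size-k subsets made up only of failing candidates.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark-level score would then be the mean of `pass_at_k` over all bugs in the benchmark.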
📝 Abstract
Debugging functional Verilog bugs consumes a significant portion of front-end design time. While Large Language Models (LLMs) have shown great potential for reducing this effort, existing LLM-based automated debugging methods underperform on industrial-scale modules. A major reason is bug-signal dilution in long contexts: a few bug-relevant tokens are overwhelmed by hundreds of unrelated lines, which diffuses the model's attention. To address this issue, we introduce ARSP, a two-stage system that mitigates dilution via semantics-guided fragmentation. A Partition LLM splits a module into semantically tight fragments; a Repair LLM patches each fragment; and the resulting edits are merged without altering unrelated logic. A synthetic data framework generates fragment-level training pairs spanning bug types, design styles, and scales to supervise both models. Experiments show that ARSP achieves 77.92% pass@1 and 83.88% pass@5, outperforming mainstream commercial LLMs, including Claude-3.7, and the state-of-the-art automated Verilog debugging tools Strider and MEIC. In addition, semantic partitioning improves pass@1 by 11.6% and pass@5 by 10.2% over whole-module debugging, validating the effectiveness of fragment-level scope reduction in LLM-based Verilog debugging.
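The partition, repair, and merge flow described in the abstract can be illustrated with a minimal sketch. This is not ARSP's implementation. All names (`Fragment`, `partition`, `merge`, `debug_module`, `toy_repair`) are hypothetical, the Partition LLM is replaced by a blank-line splitter, and the Repair LLM is replaced by a hard-coded string rule. Only the control flow and the non-intrusive merge are meant to carry over.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Fragment:
    start: int        # index of the first line (inclusive)
    end: int          # index past the last line (exclusive)
    lines: List[str]  # the fragment's source lines


def partition(lines: List[str]) -> List[Fragment]:
    """Toy stand-in for the Partition LLM: split on blank lines."""
    fragments: List[Fragment] = []
    start: Optional[int] = None
    for i, line in enumerate(lines):
        if line.strip():
            if start is None:
                start = i
        elif start is not None:
            fragments.append(Fragment(start, i, lines[start:i]))
            start = None
    if start is not None:
        fragments.append(Fragment(start, len(lines), lines[start:]))
    return fragments


def merge(lines: List[str], patches: List[Tuple[Fragment, List[str]]]) -> List[str]:
    """Splice patched fragments back; lines outside a patch are untouched."""
    merged = list(lines)
    # Apply bottom-up so earlier line indices stay valid after each splice.
    for frag, new_lines in sorted(patches, key=lambda p: p[0].start, reverse=True):
        merged[frag.start:frag.end] = new_lines
    return merged


def debug_module(source: str,
                 repair: Callable[[List[str]], Optional[List[str]]]) -> str:
    """Partition the module, repair each fragment in isolation, merge the edits."""
    lines = source.split("\n")
    patches = []
    for frag in partition(lines):
        fixed = repair(frag.lines)  # the repair step sees only this fragment
        if fixed is not None:
            patches.append((frag, fixed))
    return "\n".join(merge(lines, patches))


def toy_repair(frag_lines: List[str]) -> Optional[List[str]]:
    """Hypothetical rule standing in for the Repair LLM: or_out must use OR."""
    fixed = [l.replace("a & b", "a | b") if "or_out" in l else l
             for l in frag_lines]
    return fixed if fixed != frag_lines else None


BUGGY = "\n".join([
    "module gates(input a, input b, output and_out, output or_out);",
    "",
    "  assign and_out = a & b;",
    "",
    "  assign or_out = a & b;",
    "",
    "endmodule",
])

print(debug_module(BUGGY, toy_repair))
```

Because each patch is applied only to its own line range, fragments that the repair step leaves alone are copied through byte for byte, which is the property the abstract describes as merging edits without altering unrelated logic.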