🤖 AI Summary
This work addresses a central challenge in hardware verification: feedback from industrial simulators is costly and slow to obtain, making online reinforcement learning impractical. The authors propose an execution-aware offline agent-learning framework that formulates verification as a memoryless state-transition process guided by a deterministic evaluator. By combining execution-validated data curation, policy-aware synthetic data generation, and worst-state-prioritized sampling, the framework enables efficient and scalable learning under execution constraints. Remarkably, a compact 4B-parameter model trained with this approach achieves a 69.2% coverage pass rate under agentic evaluation, surpassing its teacher model by 5.3% and matching the performance of models an order of magnitude larger, thereby substantially alleviating the feedback-scarcity bottleneck inherent in hardware verification.
📝 Abstract
Execution-aware LLM agents offer a promising paradigm for learning from tool feedback, but such feedback is often expensive and slow to obtain, making online reinforcement learning (RL) impractical. High-coverage hardware verification exemplifies this challenge because of its reliance on industrial simulators and non-differentiable execution signals. We propose LLM4Cov, an offline agent-learning framework that models verification as memoryless state transitions guided by deterministic evaluators. Building on this formulation, we introduce execution-validated data curation, policy-aware agentic data synthesis, and worst-state-prioritized sampling to enable scalable learning under execution constraints. We further curate a reality-aligned benchmark, adapted from an existing verification suite through a revised evaluation protocol. Using the proposed pipeline, a compact 4B-parameter model achieves a 69.2% coverage pass rate under agentic evaluation, outperforming its teacher by 5.3% and performing competitively against models an order of magnitude larger.