VeriDispatcher: Multi-Model Dispatching through Pre-Inference Difficulty Prediction for RTL Generation Optimization

📅 2025-11-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multi-LLM collaborative RTL generation suffers from inefficiency and high cost due to indiscriminate model invocation and post-hoc filtering. Method: This paper proposes a task-difficulty-aware dynamic model-scheduling framework. It establishes an RTL generation difficulty benchmark by jointly modeling syntactic correctness, structural similarity, and functional equivalence, and employs a lightweight semantic-embedding classifier to predict each task's difficulty before inference, routing it to a suitable subset of ten heterogeneous LLMs. Contribution/Results: This work presents the first systematic approach to coordinated multi-LLM optimization for RTL synthesis that avoids full-model invocation and retrospective filtering. On RTLLM, the method achieves up to 18% higher accuracy using only 40% of commercial API calls; on VerilogEval, it maintains baseline accuracy while reducing API usage by 25%. The framework improves the cost-efficiency and scalability of automated hardware design.

📝 Abstract
Large Language Models (LLMs) show strong performance in RTL generation, but different models excel on different tasks because of architecture and training differences. Prior work mainly prompts or fine-tunes a single model; how to coordinate multiple distinct LLMs so that they jointly improve RTL quality while also reducing cost, rather than running all models and selecting the best output, remains understudied. We define this as the multi-LLM RTL generation problem. We propose VeriDispatcher, a multi-LLM RTL generation framework that dispatches each RTL task to suitable LLMs based on pre-inference difficulty prediction. For each model, we train a compact classifier over semantic embeddings of task descriptions, using difficulty scores derived from benchmark variants that combine syntax, structural similarity, and functional correctness. At inference, VeriDispatcher uses these predictors to route tasks to a selected subset of LLMs. Across 10 diverse LLMs on RTLLM and VerilogEval, VeriDispatcher achieves up to 18% accuracy improvement on RTLLM using only 40% of commercial calls, and on VerilogEval maintains accuracy while reducing commercial usage by 25%, enabling cost-effective, high-quality LLM deployment in hardware design automation.
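The abstract describes difficulty scores that combine syntax, structural similarity, and functional correctness. A minimal sketch of such a composite score is shown below; the weights, the fixed [0, 1] pass-rate inputs, and the `1 - quality` inversion are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical composite difficulty score over the three signals the paper
# names: syntactic correctness, structural similarity, functional correctness.
# Each input is a pass rate in [0, 1] measured across benchmark variants;
# higher combined quality maps to lower difficulty. Weights are illustrative.
def difficulty_score(syntax_ok: float,
                     structural_sim: float,
                     functional_ok: float,
                     weights: tuple[float, float, float] = (0.2, 0.3, 0.5)) -> float:
    w_syn, w_str, w_fun = weights
    quality = w_syn * syntax_ok + w_str * structural_sim + w_fun * functional_ok
    return 1.0 - quality
```

A per-model classifier would then be trained to predict this score from the semantic embedding of a task description, so the score never needs to be computed at inference time.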
Problem

Research questions and friction points this paper is trying to address.

How to coordinate multiple LLMs efficiently for RTL generation, rather than invoking all of them
How to predict task difficulty before inference so each task is dispatched to suitable models
How to reduce commercial API costs while maintaining or improving accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-LLM dispatching via pre-inference difficulty prediction
Compact classifiers trained on semantic embeddings for task routing
Difficulty benchmark jointly modeling syntax, structural similarity, and functional correctness
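The dispatching idea above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: `ModelProfile`, `predict_success`, the threshold, and the cost-first tie-breaking are all assumed names and policies.

```python
# Sketch of pre-inference dispatching: each model has a trained classifier
# that maps a task embedding to an estimated success probability; the task
# is routed to the cheapest models that clear a confidence threshold.
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class ModelProfile:
    name: str
    cost_per_call: float  # relative API cost (commercial models cost more)
    predict_success: Callable[[Sequence[float]], float]  # per-model classifier

def dispatch(embedding: Sequence[float],
             models: List[ModelProfile],
             threshold: float = 0.5,
             k: int = 3) -> List[str]:
    """Return at most k model names, cheapest first, whose predicted
    success probability on this task clears the threshold."""
    candidates = [(m.cost_per_call, m.name)
                  for m in models
                  if m.predict_success(embedding) >= threshold]
    candidates.sort()  # prefer cheaper models first
    chosen = [name for _, name in candidates[:k]]
    if not chosen:
        # Fallback for hard tasks: send to the single strongest
        # (here approximated as the most expensive) model.
        chosen = [max(models, key=lambda m: m.cost_per_call).name]
    return chosen
```

Because routing happens before any generation call, easy tasks never reach the expensive commercial models, which is where the reported 40%/25% reductions in commercial API usage would come from.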