🤖 AI Summary
Large language models (LLMs) generating Chisel hardware description code suffer from high syntax error rates and limited design variability. To address this, the paper introduces ChiseLLM, a domain-adaptation framework tailored to Chisel. The method integrates four components: (1) construction of a high-quality Chisel dataset derived from public RTL code; (2) structured chain-of-thought prompting; (3) domain-specific supervised fine-tuning; and (4) test-time scaling at inference. Together these let the model learn both Chisel semantics and hardware construction paradigms. The released ChiseLLM-7B and ChiseLLM-32B models improve syntax correctness by 18.85% and 26.32%, respectively, over their base models, and increase design variability by 47.58% over baseline reasoning models. The datasets and models are publicly available, supporting low-cost deployment and reproducible LLM-assisted agile hardware development.
📝 Abstract
The growing demand for Domain-Specific Architectures (DSAs) has driven the development of Agile Hardware Development Methodology (AHDM). Hardware Construction Languages (HCLs) such as Chisel offer high-level abstraction features, making them well suited to HCL-based AHDM. While Large Language Models (LLMs) excel at code generation tasks, they still struggle with Chisel generation, particularly with respect to syntax correctness and design variability. Recent reasoning models have significantly enhanced code generation capabilities through test-time scaling techniques. However, we found that reasoning models without domain adaptation bring no substantial benefit to Chisel code generation tasks. This paper presents ChiseLLM, a solution comprising data processing and transformation, prompt-guided reasoning trace synthesis, and domain-adapted model training. We constructed high-quality datasets from public RTL code resources and guided the model to adopt structured thinking patterns through prompt enhancement methods. Experiments demonstrate that our ChiseLLM-7B and ChiseLLM-32B models improved syntax correctness by 18.85% and 26.32%, respectively, over their base models, while increasing design variability by 47.58% compared to baseline reasoning models. Our datasets and models are publicly available, providing high-performance, cost-effective models for HCL-based AHDM and an effective baseline for future research. GitHub repository: https://github.com/observerw/ChiseLLM