MMCircuitEval: A Comprehensive Multimodal Circuit-Focused Benchmark for Evaluating LLMs

📅 2025-07-20
🤖 AI Summary
Problem: The electronic design automation (EDA) community lacks a multimodal large language model (MLLM) evaluation benchmark dedicated to circuit design. Method: This paper introduces MMCircuitEval, the first multimodal benchmark built specifically for EDA, with 3614 question-answer pairs covering general knowledge, specifications, front-end design, and back-end design for both digital and analog circuits. Questions are organized along four dimensions: design stage, circuit type, tested ability (knowledge, comprehension, reasoning, computation), and difficulty level. The data is drawn from textbooks, technical question banks, datasheets, and real-world documentation, and each pair is reviewed by domain experts. Contribution/Results: Evaluations show that current MLLMs perform markedly worse on back-end design and complex computation, which points to a need for domain-targeted training data and modeling approaches. MMCircuitEval provides a foundational evaluation resource for developing EDA-oriented MLLMs.

📝 Abstract
The emergence of multimodal large language models (MLLMs) presents promising opportunities for automation and enhancement in Electronic Design Automation (EDA). However, comprehensively evaluating these models in circuit design remains challenging due to the narrow scope of existing benchmarks. To bridge this gap, we introduce MMCircuitEval, the first multimodal benchmark specifically designed to assess MLLM performance comprehensively across diverse EDA tasks. MMCircuitEval comprises 3614 meticulously curated question-answer (QA) pairs spanning digital and analog circuits across critical EDA stages - ranging from general knowledge and specifications to front-end and back-end design. Derived from textbooks, technical question banks, datasheets, and real-world documentation, each QA pair undergoes rigorous expert review for accuracy and relevance. Our benchmark uniquely categorizes questions by design stage, circuit type, tested abilities (knowledge, comprehension, reasoning, computation), and difficulty level, enabling detailed analysis of model capabilities and limitations. Extensive evaluations reveal significant performance gaps among existing LLMs, particularly in back-end design and complex computations, highlighting the critical need for targeted training datasets and modeling approaches. MMCircuitEval provides a foundational resource for advancing MLLMs in EDA, facilitating their integration into real-world circuit design workflows. Our benchmark is available at https://github.com/cure-lab/MMCircuitEval.
Problem

Research questions and friction points this paper is trying to address.

No comprehensive benchmark exists for evaluating MLLMs in circuit design
Existing benchmarks are too narrow to assess MLLM performance across diverse EDA tasks
Existing models show significant performance gaps in back-end design and complex computations
Innovation

Methods, ideas, or system contributions that make the work stand out.

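First multimodal benchmark dedicated to EDA tasks
3614 expert-reviewed QA pairs spanning digital and analog circuits
Questions categorized by design stage, circuit type, tested ability, and difficulty level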
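
The abstract describes each question as labeled by design stage, circuit type, tested ability, and difficulty level. As a minimal sketch of how such labels allow per-category analysis, the Python below groups accuracy by any one label. Everything in it is an assumption made for illustration: the `QAItem` record, its field names, the `accuracy_by` helper, the difficulty labels, the exact-match scoring rule, and the toy data. It does not reflect the actual MMCircuitEval data format or evaluation code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QAItem:
    """One benchmark question; all field names here are hypothetical."""
    question: str
    answer: str
    stage: str         # e.g. "general knowledge", "specification", "front-end", "back-end"
    circuit_type: str  # "digital" or "analog"
    ability: str       # "knowledge", "comprehension", "reasoning", "computation"
    difficulty: str    # label set is an assumption, e.g. "easy", "medium", "hard"


def accuracy_by(items, predictions, dimension):
    """Return {category: fraction correct}, grouped by one QAItem field.

    Case-insensitive exact match is an assumed, simplistic scoring rule.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        key = getattr(item, dimension)
        total[key] += 1
        if pred.strip().lower() == item.answer.strip().lower():
            correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


# Toy data, invented purely to exercise the helper.
items = [
    QAItem("toy question 1", "A", "front-end", "digital", "knowledge", "easy"),
    QAItem("toy question 2", "B", "back-end", "digital", "computation", "hard"),
    QAItem("toy question 3", "C", "back-end", "analog", "reasoning", "hard"),
]
predictions = ["A", "B", "D"]

by_stage = accuracy_by(items, predictions, "stage")
by_circuit = accuracy_by(items, predictions, "circuit_type")
```

Grouping by a single field keeps the helper small; the same call with `"ability"` or `"difficulty"` gives the other two breakdowns.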
Chenchen Zhao
Department of Computer Science and Engineering, The Chinese University of Hong Kong
Zhengyuan Shi
Department of Computer Science and Engineering, The Chinese University of Hong Kong
Xiangyu Wen
Computer Science and Engineering, CUHK
AI Security, LLM Reasoning
Chengjie Liu
Nanjing University
EDA for analog circuits
Yi Liu
Department of Computer Science and Engineering, The Chinese University of Hong Kong
Yunhao Zhou
Shanghai Jiao Tong University
EDA, GNN, LLM
Yuxiang Zhao
Shanghai Jiao Tong University
text-to-speech, artificial intelligence, deepfake detection
Hefei Feng
School of Integrated Circuits, Southeast University
Yinan Zhu
National Center of Technology Innovation for EDA
Gwok-Waa Wan
National Center of Technology Innovation for EDA
Xin Cheng
School of Computer Science and Engineering, Southeast University
Weiyu Chen
PhD Student, Hong Kong University of Science and Technology (HKUST)
Efficient LLM, Diffusion Model, Multi-Objective Optimization
Yongqi Fu
School of Integrated Circuits, Southeast University
Chujie Chen
School of Computer Science and Technology, University of Chinese Academy of Sciences
Chenhao Xue
School of Integrated Circuits, Peking University
AI, computer architecture, EDA
Guangyu Sun
School of Integrated Circuits, Peking University
Computer Architecture, Design Automation, Emerging Memory
Ying Wang
School of Computer Science and Technology, University of Chinese Academy of Sciences
Yibo Lin
Assistant Professor at Peking University
Deep learning, VLSI CAD, design for manufacturability
Jun Yang
School of Integrated Circuits, Southeast University
Ning Xu
School of Computer Science and Engineering, Southeast University
Xi Wang
School of Integrated Circuits, Southeast University
Qiang Xu
Department of Computer Science and Engineering, The Chinese University of Hong Kong