CapBench: A Multi-PDK Dataset for Machine-Learning-Based Post-Layout Capacitance Extraction

📅 2026-04-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of high-quality, multi-node, reproducible datasets for post-layout parasitic capacitance extraction by introducing CapBench, a benchmark dataset designed to support cross-PDK transfer learning and scalability studies. The dataset spans three technology nodes (ASAP7, NanGate45, and Sky130HD) and contains 61,855 3D layout windows extracted from open-source designs placed and routed with OpenROAD. Ground-truth capacitance labels are produced by the RWCap random-walk solver and validated against Raphael, with a mean absolute error of 0.64% for total capacitance. Each window is provided in multiple representations: density maps, graph structures, and point clouds. Among the baseline models, CNNs achieve the lowest capacitance prediction error (1.75%), while GNNs are up to 41.4× faster at inference but less accurate (10.2% error).
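
The percentage figures quoted here (0.64%, 1.75%, 10.2%) read as relative errors averaged over many capacitance values. As a rough illustration only, the sketch below computes a mean absolute relative error in percent. The function name and the exact definition are assumptions made for this example; the paper may define its metric differently.

```python
def mean_abs_relative_error_pct(predicted, reference):
    """Mean of |predicted - reference| / |reference|, expressed in percent.

    Hypothetical helper for illustration; not taken from the CapBench code.
    """
    if len(predicted) != len(reference) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, r in zip(predicted, reference):
        if r == 0:
            raise ValueError("reference capacitance must be non-zero")
        total += abs(p - r) / abs(r)
    return 100.0 * total / len(reference)


if __name__ == "__main__":
    # Two made-up capacitances (arbitrary units), each off by 2%.
    print(mean_abs_relative_error_pct([1.02, 0.98], [1.0, 1.0]))
```

A relative metric of this kind keeps small and large capacitances on the same footing, which matters when windows of different sizes are compared.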

📝 Abstract

We present CapBench, a fully reproducible, multi-PDK dataset for capacitance extraction. The dataset is derived from open-source designs, including single-core CPUs, systems-on-chip, and media accelerators. All designs are fully placed and routed using 14 independent OpenROAD flow runs spanning three technology nodes: ASAP7, NanGate45, and Sky130HD. From these layouts, we extract 61,855 3D windows across three size tiers to enable transfer learning and scalability studies. High-fidelity capacitance labels are generated using RWCap, a state-of-the-art random-walk solver, and validated against the industry-standard Raphael, achieving a mean absolute error of 0.64% for total capacitance. Each window is pre-processed into density maps, graph representations, and point clouds. We evaluate 10 machine learning architectures that illustrate dataset usage and serve as baselines, including convolutional neural networks (CNNs), point cloud transformers, and graph neural networks (GNNs). CNNs demonstrate the lowest errors (1.75%), while GNNs are up to 41.4x faster but exhibit larger errors (10.2%), illustrating a clear accuracy-speed trade-off. Code and dataset are available at https://github.com/THU-numbda/CapBench.
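
To make the "density map" representation concrete, here is a minimal sketch of how the metal rectangles of one layer inside a window could be rasterized into a grid of coverage fractions. This is an assumption about what such a map might look like, not the CapBench pre-processing code: the function name, the rectangle format `(x0, y0, x1, y1)`, and the square grid are all hypothetical, and the rectangles are assumed not to overlap one another.

```python
def density_map(rects, window, n):
    """Rasterize axis-aligned rectangles into an n x n coverage grid.

    rects  : list of (x0, y0, x1, y1) tuples (hypothetical format),
             assumed non-overlapping
    window : (wx0, wy0, wx1, wy1) bounds of the layout window
    n      : number of grid cells per side
    Returns a list of rows; each entry is the covered fraction of that cell.
    """
    wx0, wy0, wx1, wy1 = window
    cell_w = (wx1 - wx0) / n
    cell_h = (wy1 - wy0) / n
    grid = [[0.0] * n for _ in range(n)]
    for x0, y0, x1, y1 in rects:
        for row in range(n):
            cy0 = wy0 + row * cell_h
            # Vertical overlap between the rectangle and this row of cells.
            oy = min(y1, cy0 + cell_h) - max(y0, cy0)
            if oy <= 0:
                continue
            for col in range(n):
                cx0 = wx0 + col * cell_w
                # Horizontal overlap between the rectangle and this cell.
                ox = min(x1, cx0 + cell_w) - max(x0, cx0)
                if ox > 0:
                    grid[row][col] += (ox * oy) / (cell_w * cell_h)
    return grid


if __name__ == "__main__":
    # One wire covering the left half of a 2 x 2 window.
    for row in density_map([(0.0, 0.0, 1.0, 2.0)], (0.0, 0.0, 2.0, 2.0), 2):
        print(row)
```

Stacking one such grid per metal layer would give an image-like tensor of the kind a CNN can consume, whereas graph and point-cloud representations keep the individual conductors explicit.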

Problem

Research questions and friction points this paper is trying to address.

capacitance extraction

post-layout

machine learning

multi-PDK

parasitic extraction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Capacitance Extraction

Multi-PDK Dataset

Machine Learning for EDA