🤖 AI Summary
Existing EDA datasets are limited in scale, typically cover a single modality, and are poorly aligned across heterogeneous design abstractions, which hinders the development of AI-driven circuit design methods. To address these limitations, ForgeEDA introduces the first open-source, end-to-end multimodal IC design dataset, unifying diverse representations (RTL code, post-mapping netlists, AIGs, and placed netlists) under a standardized schema. It is constructed with an end-to-end toolchain covering HDL parsing, logic synthesis, formal graph modeling, and physical design. The dataset fills a critical gap in large-scale, highly aligned, multi-granularity circuit data. Empirical evaluation reveals significant performance bottlenecks in mainstream EDA algorithms on PPA (Power, Performance, Area) optimization. Models trained on ForgeEDA show improved prediction accuracy and cross-task transferability, supporting further progress in AI for EDA.
📝 Abstract
We introduce ForgeEDA, an open-source, comprehensive circuit dataset spanning a variety of circuit categories. ForgeEDA includes diverse circuit representations such as Register Transfer Level (RTL) code, post-mapping (PM) netlists, And-Inverter Graphs (AIGs), and placed netlists, enabling comprehensive analysis and development. We demonstrate ForgeEDA's utility by benchmarking state-of-the-art EDA algorithms on critical tasks such as Power, Performance, and Area (PPA) optimization, highlighting its ability to expose performance gaps and drive advancements. Additionally, ForgeEDA's scale and diversity facilitate the training of AI models for EDA tasks, demonstrating its potential to improve model performance and generalization. By addressing limitations in existing datasets, ForgeEDA aims to catalyze breakthroughs in modern IC design and support the next generation of innovations in EDA.