How Can Reinforcement Learning Achieve Expert-level Placement?

📅 2026-04-27
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the persistent gap between automated and expert-level chip placement by proposing a reinforcement learning approach that moves beyond conventional wirelength-centric optimization. Rather than formalizing the intricate placement process, the method infers plausible step-by-step placement trajectories from a final expert layout and uses them as demonstrations or preference signals to train a reward model that captures the implicit objectives behind expert decisions. According to the reported experiments, the framework can learn efficiently from even a single expert design and generalizes well to unseen cases, narrowing the gap between automated placement and expert quality.
📝 Abstract
Chip placement is a critical step in physical design. Although reinforcement learning (RL)-based methods have recently emerged, their training focuses primarily on wirelength optimization, so they often fail to achieve expert-quality layouts. We identify reward design as the primary cause of the performance gap with experts. Instead of formalizing intricate processes, we circumvent this problem by learning a reward model directly from expert layouts. Our approach starts from the final expert layouts and infers step-by-step expert trajectories. Using these trajectories as demonstrations or preferences, we train a model that captures the implicit rewards latent in expert results. Experiments show that our framework can learn efficiently from even a single design and generalize well to unseen cases.
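The idea of turning one finished layout into step-by-step training signal can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the largest-first macro ordering, the random alternative positions used as the "less preferred" option, the Bradley-Terry form of the preference loss, and the function names `infer_trajectory`, `make_preference_pairs`, and `preference_loss`.

```python
import math
import random


def infer_trajectory(final_layout, areas):
    """Replay a finished layout as a sequence of single-macro placement steps.

    Assumption: macros are replayed largest-first (ties broken by name).
    The abstract does not state the ordering rule the authors use.
    """
    order = sorted(final_layout, key=lambda m: (-areas[m], m))
    placed, trajectory = {}, []
    for macro in order:
        state = dict(placed)           # partial layout before this step
        action = final_layout[macro]   # (x, y) taken from the expert layout
        trajectory.append((state, macro, action))
        placed[macro] = action
    return trajectory


def make_preference_pairs(trajectory, canvas, rng):
    """Pair each expert step with a random alternative position.

    Assumption: a uniformly random position stands in for the
    less-preferred action; this is purely illustrative.
    """
    width, height = canvas
    pairs = []
    for state, macro, action in trajectory:
        alternative = (rng.uniform(0, width), rng.uniform(0, height))
        pairs.append((state, macro, action, alternative))
    return pairs


def preference_loss(reward_expert, reward_other):
    """Bradley-Terry negative log-likelihood that the expert step wins.

    Equals -log(sigmoid(reward_expert - reward_other)), written in a
    numerically stable softplus form.
    """
    diff = reward_expert - reward_other
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical three-macro design used only to exercise the functions.
layout = {"a": (0.0, 0.0), "b": (5.0, 5.0), "c": (2.0, 3.0)}
areas = {"a": 1.0, "b": 9.0, "c": 4.0}
steps = infer_trajectory(layout, areas)
pairs = make_preference_pairs(steps, canvas=(10.0, 10.0), rng=random.Random(0))
```

A reward model trained this way would score each `(state, macro, position)` triple, and minimizing `preference_loss` over the pairs pushes the expert's position above the alternative. How the authors actually construct trajectories and preferences may differ from this sketch.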
Problem

Research questions and friction points this paper is trying to address.

chip placement
reinforcement learning
reward design
expert layouts
physical design
Innovation

Methods, ideas, or system contributions that make the work stand out.

reward modeling
expert demonstrations
chip placement
reinforcement learning
trajectory inference
Ruo-Tong Chen
State Key Laboratory of Novel Software Technology, Nanjing University, China; School of Artificial Intelligence, Nanjing University, China; Huawei Noah’s Ark Lab, China
Ke Xue
Nanjing University
Black-Box Optimization, Machine Learning
Chengrui Gao
PhD student, Nanjing University
Learning to Optimize, Combinatorial Optimization, Chip Placement, Reinforcement Learning
Yunqi Shi
State Key Laboratory of Novel Software Technology, Nanjing University, China; School of Artificial Intelligence, Nanjing University, China
Tian Xu
Nanjing University
Reinforcement Learning
Peng Xie
State Key Laboratory of Novel Software Technology, Nanjing University, China; School of Artificial Intelligence, Nanjing University, China
Siyuan Xu
Huawei Noah's Ark Lab
Physical Design, Approximate Computing, High-level Synthesis, FPGA, Machine Learning
Mingxuan Yuan
Huawei Noah’s Ark Lab, China
Chao Qian
Nanjing University
Artificial Intelligence, Evolutionary Algorithms, Machine Learning
Zhi-Hua Zhou
Nanjing University
Artificial Intelligence, Machine Learning, Data Mining