🤖 AI Summary
Mixed-height legalization is computationally intensive and offers limited parallelism, which creates performance bottlenecks. To address them, this paper proposes an FPGA-CPU heterogeneous collaborative acceleration framework. Our method features: (1) a multi-granularity pipelined architecture that deeply pipelines cell movement and position search; (2) dynamic task partitioning and customized dataflow optimization tailored to legalization-specific characteristics; and (3) efficient FPGA implementation of geometric constraint checking and candidate position generation, while delegating global scheduling and convergence control to the CPU. Experimental evaluation on industrial benchmarks demonstrates up to 18.3× and 5.4× speedup over state-of-the-art CPU-GPU and multithreaded CPU tools, respectively, alongside 4% and 1% improvements in legalization quality. The framework also scales better than both baselines.
📝 Abstract
In this work, we present FLEX, an FPGA-CPU accelerator for mixed-cell-height legalization. We address the associated challenges from three perspectives. First, we optimize the task assignment strategy and partition tasks efficiently between the FPGA and the CPU to exploit their complementary strengths. Second, we employ a multi-granularity pipelining technique to accelerate the most time-consuming step in legalization, finding the optimal placement position (FOP). Finally, we target the computationally intensive cell shifting process in FOP, optimizing its design so that it fits the multi-granularity pipelining framework for further speedup. Experimental results show that FLEX achieves up to 18.3x and 5.4x speedups over state-of-the-art CPU-GPU and multi-threaded CPU legalizers, respectively, with better scalability, while improving legalization quality by 4% and 1%, respectively.
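For readers unfamiliar with the FOP step, the following is a minimal software sketch of what "finding a placement position" means for a mixed-height cell. It is not FLEX's algorithm: it uses an exhaustive search over a small site grid, does not shift already-placed cells, and ignores power-rail alignment. The names `Cell`, `overlaps`, `is_legal`, and `find_position`, and the displacement cost that simply adds site and row offsets, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    x: int       # left edge, in placement sites
    row: int     # bottom row index
    width: int   # width in sites
    height: int  # height in rows (1 = single-row, >1 = multi-row cell)


def overlaps(a: Cell, b: Cell) -> bool:
    """True if the two axis-aligned rectangles share any area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.row < b.row + b.height and b.row < a.row + a.height)


def is_legal(c: Cell, placed: List[Cell], n_sites: int, n_rows: int) -> bool:
    """A position is legal if the cell is inside the grid and overlaps nothing."""
    inside = (0 <= c.x and c.x + c.width <= n_sites
              and 0 <= c.row and c.row + c.height <= n_rows)
    return inside and not any(overlaps(c, p) for p in placed)


def find_position(target: Cell, placed: List[Cell],
                  n_sites: int, n_rows: int) -> Optional[Cell]:
    """Exhaustively search for the legal position closest to the target's
    current location (assumed cost: site offset plus row offset)."""
    best, best_cost = None, None
    for row in range(n_rows - target.height + 1):
        for x in range(n_sites - target.width + 1):
            cand = Cell(x, row, target.width, target.height)
            if not is_legal(cand, placed, n_sites, n_rows):
                continue
            cost = abs(x - target.x) + abs(row - target.row)
            if best_cost is None or cost < best_cost:
                best, best_cost = cand, cost
    return best  # None when no legal position exists


# Usage: a double-height cell that currently overlaps a placed cell.
placed = [Cell(x=0, row=0, width=4, height=2)]
target = Cell(x=1, row=0, width=2, height=2)
print(find_position(target, placed, n_sites=8, n_rows=4))
```

Each candidate position is evaluated independently of the others, which suggests why a search of this kind is a natural candidate for hardware pipelining. The cell shifting that the abstract identifies as computationally intensive is exactly the part this sketch omits.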