🤖 AI Summary
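Problem: NP-hard combinatorial optimization problems become increasingly inefficient to solve on conventional von Neumann architectures as problem size grows. Existing Ising machines (dynamical, digital, and compute-in-memory) often suffer from poor spin initialization and a trade-off between algorithmic performance and hardware efficiency: hardware-friendly schemes such as simulated annealing converge slowly, while faster algorithms such as simulated bifurcation are difficult to implement efficiently in compute-in-memory hardware.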
Method: This work proposes a hardware–software co-designed compute-in-memory (CiM) Ising machine with two parts: (i) a 32×256 FeFET-based CiM chip, built with ferroelectric capacitors integrated at the back end of line (BEOL) of a 180 nm CMOS platform, that natively accelerates vector–matrix and vector–matrix–vector operations; and (ii) a two-step algorithmic flow in which an attention-inspired initialization that exploits global spin topology is followed by a lightweight simulated bifurcation algorithm tailored for CiM implementation.
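Results: On Max-Cut instances with up to 100,000 nodes, the attention-inspired initialization reduces the required iterations by up to 80%, and the co-designed solver achieves up to a 175.9× speedup over a GPU-based simulated bifurcation implementation while consistently delivering superior solution quality.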
📝 Abstract
Computationally hard combinatorial optimization problems are pervasive in science and engineering, yet their NP-hard nature renders them increasingly inefficient to solve on conventional von Neumann architectures as problem size grows. Ising machines implemented using dynamical, digital and compute-in-memory (CiM) approaches offer a promising alternative, but often suffer from poor initialization and a fundamental trade-off between algorithmic performance and hardware efficiency. Hardware-friendly schemes such as simulated annealing converge slowly, whereas faster algorithms, including simulated bifurcation, are difficult to implement efficiently in CiM hardware, limiting both convergence speed and solution quality. To address these limitations, here we present a ferroelectric field-effect transistor (FeFET)-based CiM Ising framework that tightly co-designs algorithms and hardware to efficiently solve large-scale combinatorial optimization problems. The proposed approach employs a two-step algorithmic flow: an attention-inspired initialization that exploits global spin topology and reduces the required iterations by up to 80%, followed by a lightweight simulated bifurcation algorithm specifically tailored for CiM implementation. To natively accelerate the core vector-matrix and vector-matrix-vector operations in both steps, we fabricate a 32x256 FeFET CiM chip using ferroelectric capacitors integrated at the back end of line of a 180-nm CMOS platform. Across Max-Cut instances with up to 100,000 nodes, the proposed hardware-software co-designed solver achieves up to a 175.9x speedup over a GPU-based simulated bifurcation implementation while consistently delivering superior solution quality.
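To make the two core operations in the abstract concrete, the following is a minimal software sketch of a simulated-bifurcation-style Max-Cut solver, not the authors' implementation. It assumes the ballistic variant of simulated bifurcation, the coupling convention J = -W, and illustrative values for `steps`, `dt`, `a0` and `c0`; all function names are hypothetical. The small random starting point is only a stand-in, since the abstract does not specify the attention-inspired initialization in detail. The local field `J x` is the vector-matrix product and the Ising energy `-1/2 s^T J s` is the vector-matrix-vector product that a CiM array would evaluate in place.

```python
import random


def local_field(J, x):
    """Vector-matrix product h = J x."""
    return [sum(j_ij * x_j for j_ij, x_j in zip(row, x)) for row in J]


def ising_energy(J, s):
    """Vector-matrix-vector product E = -1/2 * s^T J s."""
    return -0.5 * sum(s_i * h_i for s_i, h_i in zip(s, local_field(J, s)))


def cut_value(W, s):
    """Total weight of edges whose endpoints carry opposite spins."""
    n = len(s)
    return sum(W[i][j] for i in range(n) for j in range(i + 1, n) if s[i] != s[j])


def ballistic_sb(J, x0, steps=200, dt=0.5, a0=1.0, c0=0.2):
    """Ballistic simulated bifurcation (assumed variant; parameters are illustrative)."""
    n = len(x0)
    x = list(x0)       # oscillator positions, relaxed spins in [-1, 1]
    y = [0.0] * n      # oscillator momenta
    for k in range(steps):
        a_t = a0 * k / steps          # pump amplitude ramps linearly from 0 towards a0
        h = local_field(J, x)         # one vector-matrix product per iteration
        for i in range(n):
            y[i] += (-(a0 - a_t) * x[i] + c0 * h[i]) * dt
            x[i] += a0 * y[i] * dt
            if abs(x[i]) > 1.0:       # inelastic wall: clamp position, reset momentum
                x[i] = 1.0 if x[i] > 0 else -1.0
                y[i] = 0.0
    return [1 if v >= 0 else -1 for v in x]


# Toy instance: an unweighted 4-cycle (hypothetical example graph).
W = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
J = [[-w for w in row] for row in W]   # with J = -W, lower energy means a larger cut

rng = random.Random(0)
x0 = [rng.uniform(-0.1, 0.1) for _ in W]   # placeholder for the paper's initialization

spins = ballistic_sb(J, x0)
print(spins, cut_value(W, spins), ising_energy(J, spins))
```

Under this convention the energy and the cut are related by cut = (total edge weight - E) / 2, so a better starting point `x0` would be expected to reach a given cut in fewer iterations of the loop, which is the role the abstract assigns to the initialization step.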