Scalable Thermodynamic Second-order Optimization

📅 2025-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the impracticality of second-order optimizers such as K-FAC on digital hardware, where the required matrix inversions are prohibitively expensive, this paper proposes a scalable algorithm for running K-FAC on thermodynamic computing hardware. Methodologically, it maps the Kronecker-factored curvature approximation onto physics-based analog dynamics, which solve the associated linear systems in situ and thereby eliminate explicit digital inversion. An asymptotic complexity analysis predicts a growing advantage over digital implementations as n, the number of neurons per layer, increases. Numerical experiments show that the benefits of second-order optimization can be preserved even under significant quantization noise, and hardware-aware estimates based on realistic device characteristics predict substantial training speedups for large-scale vision and graph learning tasks.
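
As a rough illustration of the idea, the sketch below performs a single K-FAC layer update in which both Kronecker-factor inverse applications are handed to a noisy linear-system solver, here modeled as simulated overdamped Langevin (Ornstein-Uhlenbeck) dynamics whose stationary mean is the solution of the linear system. The function names (thermo_solve, kfac_layer_update), the dynamics model standing in for the analog hardware, and all parameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def thermo_solve(M, B, beta=1e4, dt=1e-2, burn_in=1000, samples=1000, rng=None):
    """Stand-in for a thermodynamic linear-system solver (illustrative only).

    Simulates overdamped Langevin / Ornstein-Uhlenbeck dynamics
        dX = -(M X - B) dt + sqrt(2/beta) dW,
    whose stationary mean for symmetric positive-definite M is M^{-1} B.
    Samples after burn-in are averaged; the 1/beta noise mimics the
    physical and quantization noise discussed in the paper.
    Step size and durations are illustrative and should be tuned to M's spectrum.
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.zeros_like(B, dtype=float)
    acc = np.zeros_like(B, dtype=float)
    for t in range(burn_in + samples):
        X = X + dt * (B - M @ X) + np.sqrt(2.0 * dt / beta) * rng.standard_normal(B.shape)
        if t >= burn_in:
            acc += X
    return acc / samples

def kfac_layer_update(grad_W, A, G, damping=1e-2, lr=1e-1, solver=thermo_solve):
    """One K-FAC preconditioned step for a dense layer (hypothetical helper).

    A: second moments of layer inputs            (n_in  x n_in)
    G: second moments of backpropagated gradients (n_out x n_out)
    The Kronecker-factored natural gradient is G^{-1} grad_W A^{-1};
    both inverse applications are delegated to the noisy solver
    instead of being computed by explicit digital inversion.
    """
    n_out, n_in = grad_W.shape
    G_d = G + damping * np.eye(n_out)
    A_d = A + damping * np.eye(n_in)
    X = solver(G_d, grad_W)        # X     = G^{-1} grad_W
    delta = solver(A_d, X.T).T     # delta = G^{-1} grad_W A^{-1}
    return -lr * delta
```

The point of the sketch is only that the two linear solves, the dominant cost of K-FAC on digital hardware, become calls to a physical equilibration process, while the rest of the update stays unchanged.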

📝 Abstract
Many hardware proposals have aimed to accelerate inference in AI workloads. Less attention has been paid to hardware acceleration of training, despite the enormous societal impact of rapid training of AI models. Physics-based computers, such as thermodynamic computers, offer an efficient means to solve key primitives in AI training algorithms. Optimizers that normally would be computationally out-of-reach (e.g., due to expensive matrix inversions) on digital hardware could be unlocked with physics-based hardware. In this work, we propose a scalable algorithm for employing thermodynamic computers to accelerate a popular second-order optimizer called Kronecker-factored approximate curvature (K-FAC). Our asymptotic complexity analysis predicts increasing advantage with our algorithm as $n$, the number of neurons per layer, increases. Numerical experiments show that even under significant quantization noise, the benefits of second-order optimization can be preserved. Finally, we predict substantial speedups for large-scale vision and graph problems based on realistic hardware characteristics.
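
To make the quantization-noise claim concrete, the following sketch reuses the thermo_solve and kfac_layer_update helpers above and feeds uniformly quantized Kronecker factors to the noisy solver. The quantization model, bit width, and toy dimensions are illustrative assumptions, not the paper's scheme.

```python
import numpy as np

def quantize(M, bits=8):
    """Uniform symmetric quantization, a crude stand-in for the
    precision of the digital-to-analog interface (illustrative only)."""
    scale = np.max(np.abs(M)) / (2 ** (bits - 1) - 1)
    return np.round(M / scale) * scale

# Toy layer: n_in inputs, n_out outputs, random SPD Kronecker factors.
rng = np.random.default_rng(0)
n_in, n_out = 16, 8
A = np.cov(rng.standard_normal((n_in, 256)))   # input second moments
G = np.cov(rng.standard_normal((n_out, 256)))  # gradient second moments
grad_W = rng.standard_normal((n_out, n_in))

# Preconditioned step with quantized factors handed to the noisy solver
# (kfac_layer_update is defined in the sketch above).
step = kfac_layer_update(grad_W, quantize(A), quantize(G))
```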
Problem

Research questions and friction points this paper is trying to address.

Accelerate AI training using thermodynamic computers
Enable efficient second-order optimization on physics-based hardware
Scale second-order optimizers for large AI models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Thermodynamic computers accelerate AI training
Scalable algorithm for second-order optimization
Preserves optimization benefits under quantization noise